A Code Assistant with Multi-Language Support: Inside the Technical Highlights of Seed-Coder-8B-Base


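Seed-Coder-8B-Base: When Your Code Starts "Writing Itself" 🤯

Have you ever had this moment: you stare at the editor, you know exactly how the logic should go, but you just can't be bothered to type out dozens of lines of repetitive boilerplate? Or, when switching languages in a multi-language project, you keep looking up docs and digging through examples, as if your brain had to "restart its runtime" every time?

There is now a model quietly taking care of these annoyances: Seed-Coder-8B-Base. Unlike the "giant" AI models, it doesn't need a whole data center to run, and unlike general-purpose large models, its grasp of code is more than skin deep. It is the kind of model you drop onto a local GPU, go make a coffee, and come back to find already writing functions for you ☕💻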


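Picture this: you are building a microservice, the frontend is in React, and suddenly the backend needs an endpoint written in Go. Your Go is shaky, so you write only a comment:

// Return the user list, newest registrations first, 20 per page

You hit the shortcut, and the IDE pops up a suggestion:

func GetUsers(c *gin.Context) {
    // strconv.Atoi returns (int, error); fall back to page 1 on bad input
    page, err := strconv.Atoi(c.DefaultQuery("page", "1"))
    if err != nil || page < 1 {
        page = 1
    }
    limit := 20
    offset := (page - 1) * limit

    var users []User
    db.Order("created_at DESC").Offset(offset).Limit(limit).Find(&users)

    c.JSON(200, gin.H{"data": users})
}

That's right: pagination, the ORM call, and the HTTP response are all taken care of. Behind a suggestion like this, there may well be a model like Seed-Coder-8B-Base doing the work.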


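It Is Not a "Chatbot That Can Code" but a "Silent Engine" Specialized in Code

Many people's first reaction is: "Oh, another Copilot clone?"
Not quite. Seed-Coder-8B-Base has a very clear position: no chat interface, no fancy features, just one job, which is to be the core inference unit for high-quality code generation.

Think of it as a high-performance engine rather than a whole car. It has no steering wheel and no seats, but once you fit it into your IDE plugin, CI/CD pipeline, or internal developer platform, the whole system speeds up.

Its foundations are solid too: it is a decoder-only Transformer (a causal language model) that went through self-supervised pretraining on a large corpus of high-quality open-source code. That means it has seen countless ways to write a for loop, all sorts of API call patterns, and even the "elegant hacks" only veterans know.

When you give it a piece of context such as:

# Sort a list of dicts by 'score', descending
def sort_by_score(data):

it is not just guessing whether the next token is a colon. Its prediction reflects what the context implies:
- this is a sorting task;
- data is a list of dicts;
- the order is descending by the 'score' key;
- in Python this is usually done with sorted() plus a lambda.

So the completion arrives almost instantly:

    return sorted(data, key=lambda x: x['score'], reverse=True)

The result is well formatted, syntactically valid, and PEP 8 compliant. That is what "understanding code" looks like, as opposed to "memorizing code".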
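To make the completion above concrete, here is a minimal sketch that joins the prompt and the suggested line into one function and runs it on sample data. The `players` list is made up for illustration. Note that the one-liner assumes every dict has a 'score' key; a missing key raises KeyError.

```python
def sort_by_score(data):
    """Return the dicts ordered by their 'score' value, highest first."""
    return sorted(data, key=lambda x: x["score"], reverse=True)


# Hypothetical sample data, only for illustration
players = [
    {"name": "ann", "score": 72},
    {"name": "bob", "score": 95},
    {"name": "cy", "score": 72},
]

ranked = sort_by_score(players)
# sorted() is stable even with reverse=True, so ties keep their input order
print([p["name"] for p in ranked])  # ['bob', 'ann', 'cy']
```

Because sorted() returns a new list, the caller's original list is left untouched.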


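Why 8B, of All Sizes? Too Big or Too Small Both Fail!

With parameter counts, bigger is not always better. Here is a realistic comparison:

| Model | Parameters | Memory (FP16 weights) | Single-GPU deployment? | Inference latency |
| --- | --- | --- | --- | --- |
| LLaMA-3-8B (general purpose) | 8B | ~16 GB | ✅ Feasible | Medium |
| Codex / GitHub Copilot | Not publicly disclosed for the production service | N/A (hosted service) | ❌ Cloud only | (not given) |
| Seed-Coder-8B-Base | 8B | ~16 GB | ✅ A 24 GB RTX 3090 is enough | <300 ms |

See the pattern? At the same 8B size, Seed-Coder focuses on code, so on code tasks it generally does noticeably better than a general-purpose model. And compared with closed commercial services that only run in the vendor's cloud, it fits comfortably on a single consumer-grade GPU, which makes it a good fit for private deployment.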
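As a back-of-the-envelope check on the memory column, the weights alone need roughly "parameter count × bytes per parameter". The sketch below is plain arithmetic, not a measurement: it ignores the KV cache, activations, and framework overhead, which add to the total, and the helper name `weight_memory_gb` is made up for illustration.

```python
# Bytes needed to store one parameter at common precisions
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Estimate the memory taken by the weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("fp16", "int8", "int4"):
    print(f"8B parameters in {dtype}: ~{weight_memory_gb(8e9, dtype):.0f} GB")
```

An 8B model in FP16 therefore needs about 16 GB for weights, which leaves some headroom on a 24 GB card; quantizing to 8-bit or 4-bit is the usual way to fit smaller GPUs.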

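It is like choosing a vehicle: for hauling freight you buy a truck, for drifting you pick a sports car, but for a daily commute plus the occasional bit of moving, an SUV with enough power and reasonable fuel consumption is the best answer. Seed-Coder-8B-Base is that "just right" choice.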


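Multi-Language Support Means Not Just "Can Read It" but "Can Transfer It"

Even more impressive is its cross-language ability. Say you usually write data-processing logic in Python, but now need the same thing in JavaScript. You can give it this prompt directly:

# Given a list of numbers, return only even ones
def filter_even(nums):
    return [n for n in nums if n % 2 == 0]

then add one line:

# Now write the same logic in JavaScript

and it can output:

function filterEven(nums) {
    return nums.filter(n => n % 2 === 0);
}

Note that this is not a word-for-word syntax translation. The model picks up the abstract logic of "filter the even numbers" and maps it onto idiomatic practice in the target language (using .filter rather than a for loop, for example). This kind of "semantic transfer" is where a specialized code model earns its keep.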
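Because a base model simply continues text, the "port this to another language" request above is just string assembly: the source code followed by a comment naming the target language. A minimal sketch, where `build_port_prompt` is a hypothetical helper and not part of any library:

```python
def build_port_prompt(source_code: str, target_language: str) -> str:
    """Append a porting instruction to existing code, as a base model would see it."""
    instruction = f"# Now write the same logic in {target_language}"
    return f"{source_code.rstrip()}\n\n{instruction}\n"


python_source = '''# Given a list of numbers, return only even ones
def filter_even(nums):
    return [n for n in nums if n % 2 == 0]
'''

prompt = build_port_prompt(python_source, "JavaScript")
print(prompt)
```

The resulting string is what you would pass to the tokenizer; how well the continuation follows the instruction still depends on the model and the decoding settings.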


Hands-On: Put It to Work in Three Steps 💼

Want to try it? The code below starts a quick local inference prototype on your own machine:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Hugging Face repository ID (verify it on the Hub before running)
model_name = "ByteDance-Seed/Seed-Coder-8B-Base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"  # place the weights on the available GPU(s) automatically
)

# A base model continues text, so the prompt is a comment plus a function signature
prompt = """# Generate a function to calculate Fibonacci sequence up to n terms
def fib(n):
"""

# Move the inputs to the device that holds the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=120,
        do_sample=False,  # greedy decoding; temperature only applies when do_sample=True
        pad_token_id=tokenizer.eos_token_id
    )

generated = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated)

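The output will most likely look like this:

# Generate a function to calculate Fibonacci sequence up to n terms
def fib(n):
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    elif n == 2:
        return [0, 1]

    result = [0, 1]
    for i in range(2, n):
        result.append(result[-1] + result[-2])
    return result

Clean and tidy, with the edge cases covered. You will also notice that even though the prompt never mentioned if n <= 0:, the model fills in this kind of defensive logic on its own, because it saw so many similar patterns during training.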
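"Most likely" is not a guarantee, so it is worth checking a completion before trusting it. The sketch below is one possible approach, not a complete safety mechanism: it parses the generated text with ast, executes it in an isolated namespace, and compares a few inputs against expected outputs. The `check_generated` helper and the test cases are assumptions made for illustration; exec() on untrusted model output should only ever run inside a sandbox.

```python
import ast

# The completion shown above, stored as a string as it would come back from the model
GENERATED = '''
def fib(n):
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    elif n == 2:
        return [0, 1]

    result = [0, 1]
    for i in range(2, n):
        result.append(result[-1] + result[-2])
    return result
'''


def check_generated(source, func_name, cases):
    """Return a list of (args, expected, actual) for every failing case."""
    ast.parse(source)  # raises SyntaxError if the completion is not valid Python
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)  # sandbox this in real use
    func = namespace[func_name]
    failures = []
    for args, expected in cases:
        actual = func(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


cases = [((0,), []), ((1,), [0]), ((2,), [0, 1]), ((6,), [0, 1, 1, 2, 3, 5]), ((-3,), [])]
failures = check_generated(GENERATED, "fib", cases)
print("all cases passed" if not failures else failures)
```

The same harness can sit behind an IDE plugin or a CI step to filter out completions that do not even parse.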

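💡 Tips
- Use temperature=0.1~0.3 together with do_sample=True for stable output, which suits completion (with do_sample=False decoding is greedy and temperature is ignored);
- Raise it to 0.7~1.0 for more creative generation, such as exploring algorithm design ideas;
- Combine with beam search (set num_beams > 1) to explore several plausible paths.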
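To see why a low temperature gives more stable output, it helps to look at what temperature does to the next-token distribution: the logits are divided by the temperature before the softmax, so small values sharpen the distribution and large values flatten it. The sketch below uses made-up logits for three candidate tokens and plain Python, independent of any model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling them by 1/temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate next tokens
logits = [2.0, 1.0, 0.5]

for temperature in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At 0.2 almost all probability mass lands on the top candidate, so sampling behaves nearly like greedy decoding; at 1.0 the alternatives keep a meaningful share, which is where the extra variety comes from.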


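How to Set Up the Architecture? Don't Let It "Run Hungry"

The model is lightweight, but a careless deployment will still crawl like a slideshow. This is the architecture pattern we settled on in real projects:

graph TD
    A[VS Code / JetBrains plugin] --> B(API gateway)
    B --> C{Cache check}
    C -->|hit| D[Return cached result ✅]
    C -->|miss| E[Call inference service]
    E --> F[Seed-Coder-8B-Base + Triton Server]
    F --> G[KV cache speeds up consecutive input]
    G --> H[ONNX Runtime optimized inference]
    H --> I[Docker containerized deployment]
    I --> J[NVIDIA A10/A100 GPU pool]

Key points 👇:

  • KV cache: stores the attention keys and values of earlier tokens so the history is not recomputed on every step, which greatly speeds up consecutive completions;
  • Triton Inference Server: supports dynamic batching, merging multiple requests into one forward pass, which can raise throughput substantially under concurrent load;
  • ONNX/TensorRT: converts the PyTorch model to an optimized format, which in our experience speeds up inference by 30% or more, depending on model and hardware;
  • Docker + Kubernetes: provides elastic scaling, adding instances automatically at peak times.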
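The "cache check" step in the diagram can be sketched as a small LRU cache keyed by a hash of the request. This is an illustrative sketch under stated assumptions, not the setup from the project described here: `CompletionCache` and `fake_infer` are made-up names, and reusing a cached result is only safe when decoding is deterministic (greedy or fixed seed), since sampled outputs differ between calls.

```python
import hashlib
from collections import OrderedDict


class CompletionCache:
    """A small in-memory LRU cache for completion results."""

    def __init__(self, max_entries=1024):
        self._entries = OrderedDict()
        self._max_entries = max_entries

    @staticmethod
    def _key(prompt, max_new_tokens):
        # Hash the request so long prompts do not become long dictionary keys
        return hashlib.sha256(f"{max_new_tokens}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt, max_new_tokens, infer):
        """Return (completion, was_cache_hit); call infer only on a miss."""
        key = self._key(prompt, max_new_tokens)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key], True
        result = infer(prompt, max_new_tokens)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return result, False


calls = []


def fake_infer(prompt, max_new_tokens):
    """Stand-in for the real inference service, recording each call."""
    calls.append(prompt)
    return prompt + "    pass\n"


cache = CompletionCache(max_entries=2)
first = cache.get_or_compute("def f():\n", 32, fake_infer)
second = cache.get_or_compute("def f():\n", 32, fake_infer)
print(first[1], second[1], len(calls))  # False True 1
```

In production the same idea usually lives in a shared store such as Redis so that all gateway instances see the same entries.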

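We rolled this setup out at a fintech company, bringing average completion latency down from 480 ms to under 180 ms and raising concurrent capacity fivefold; the team measured a coding-efficiency gain of about 37%.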


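Security Cannot Be an Afterthought ⚠️

Of course, putting a model this capable into an enterprise environment means security has to be considered up front:

  • 🔒 No transmission over the public internet: deploy locally for sensitive projects to keep code from leaking;
  • 🧼 Input sanitization: automatically replace project names, usernames, IP addresses, and other sensitive details before the text reaches the model;
  • 📊 Audit logs: record every generation request so it can be traced and reviewed for compliance;
  • 🛡️ Access control: give different roles access to different model versions (interns, for example, only get the basic version).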
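The input-sanitization item can be sketched with a few regular expressions. This is a minimal illustration, not a complete data-loss-prevention solution: the patterns, the placeholder tokens, and the sample snippet are all assumptions, and the simple IPv4 pattern will also match four-part version strings such as 1.2.3.4.

```python
import re

# Simplified patterns for illustration; real deployments need broader coverage
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def redact(text, project_names=()):
    """Replace e-mail addresses, IPv4 addresses, and known project names with placeholders."""
    text = EMAIL.sub("<EMAIL>", text)
    text = IPV4.sub("<IP>", text)
    for name in project_names:
        text = re.sub(re.escape(name), "<PROJECT>", text, flags=re.IGNORECASE)
    return text


# Hypothetical snippet that is about to be sent to the model
snippet = 'conn = connect("10.12.0.7", user="li.wei@example.com")  # apollo-billing DB'
print(redact(snippet, project_names=["apollo-billing"]))
# conn = connect("<IP>", user="<EMAIL>")  # <PROJECT> DB
```

Keeping a mapping from placeholders back to the original values lets the plugin restore them in the completion before showing it to the developer.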

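Some teams also add RAG (retrieval-augmented generation): they first look up similar code snippets in an internal knowledge base, then hand them to the model as reference, which helps keep the output in line with the company's coding conventions.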
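The retrieve-then-generate flow can be sketched without any external service. The version below ranks snippets by simple token overlap (Jaccard similarity) and prepends the best matches to the prompt. Real systems typically use embedding search instead; the knowledge base, the helper names, and the prompt layout here are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(query, snippets, k=2):
    """Return the k snippets whose tokens overlap most with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(snippets, key=lambda s: jaccard(query_tokens, tokenize(s)), reverse=True)
    return ranked[:k]


def build_prompt(query, snippets, k=2):
    """Place the retrieved snippets ahead of the task description."""
    context = "\n\n".join(retrieve(query, snippets, k))
    return f"# Reference snippets from the internal codebase:\n{context}\n\n# Task: {query}\n"


# Hypothetical internal knowledge base
KNOWLEDGE_BASE = [
    "def load_csv(path):\n    return pd.read_csv(path)",
    "def send_email(to, subject, body):\n    smtp.send(to, subject, body)",
    "def save_csv(df, path):\n    df.to_csv(path, index=False)",
]

query = "load a csv file from path"
print(build_prompt(query, KNOWLEDGE_BASE, k=1))
```

The assembled prompt is then sent to the model in place of the bare task description, so the completion can imitate the helper functions and naming the team already uses.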


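It Is Not Just a "Productivity Tool" but a "Knowledge Equalizer"

What strikes me most is actually its social value.

In many small companies and educational institutions, the cost of getting newcomers up to speed on Spring Boot, React, or Kubernetes is very high. With a model like Seed-Coder-8B-Base, even a student three months into programming can describe what they want in natural language and get a structurally correct code skeleton.

"Write me a login endpoint with JWT authentication."
"Generate a Vue form that validates a phone number and an email address."
"Read this CSV and draw a bar chart."

These are no longer "advanced tasks" but workflows you can start with one sentence. The technical barrier really has been lowered.

We also see open-source communities starting to use models like this to generate documentation examples, unit tests, and even translations of Chinese comments, helping developers who are not native English speakers take part in collaboration.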


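Finally, the Future: Will It Replace Programmers? 🤖❌

No. But it will make life increasingly hard for programmers who can "only type code by hand".

Developers of the future will compete not on memory (recalling APIs) or typing speed (writing boilerplate), but on:
- How do you describe a requirement precisely?
- How do you evaluate the quality of generated code?
- How do you combine multiple AI modules into an automated system?

In other words, you move up from "code worker" to "AI collaborator".

Base models like Seed-Coder-8B-Base are the infrastructure of this shift. They run quietly in the background, never stealing the spotlight, yet they hold up the whole edifice of AI-assisted development.


So the next time you write a one-line comment and watch the screen complete a clean implementation, allow yourself a small smile 😏
That is not magic. It is 8 billion parameters doing some of the thinking for you.

What you need to do is ask better questions.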

Disclosure: parts of this article were generated with AI assistance (AIGC) and are for reference only.
