DeepSeek R1 deployed with SGLang returns no <think> tag in its output

Latest recommended article published on 2025-05-04 23:52:44

Elwin Wong

Category: Large Models. Tags: Artificial Intelligence, Large Models, LLM, DeepSeek-R1, SGLang

Article link: https://blog.csdn.net/zhaoyuanh/article/details/145914790

Background

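A colleague recently deployed the full-size DeepSeek R1 on one of our company servers using SGLang, and requests to it showed an odd problem: the text returned by the API has no <think> tag at the start, yet it does contain the matching </think> tag, as shown below: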

{
	"id":"ccdb5eb2ab4f432ea91cb22cdef8bebe",
	"object":"chat.completion",
	"created":1740644733,
	"model":"default",
	"choices":[
		{
			"index":0,
			"message":
				{
					"role":"assistant",
					"content":"嗯，用户发了个“hello？”，看起来像是在测试或者看看有没有人在线。我需要确认他们是不是需要帮助，或者有没有具体的问题要问。应该保持友好，开放式的回应，鼓励他们提出具体的问题或者需求。同时注意语气要亲切，避免显得机械。\n</think>\n\nHello! How can I assist you today? 😊",
					"tool_calls":null
				},
			"logprobs":null,
			"finish_reason":"stop",
			"matched_stop":1
		}
	],
	"usage":{"prompt_tokens":9,"total_tokens":81,"completion_tokens":72,"prompt_tokens_details":null}
}

Cause

A search of the SGLang GitHub project showed that many people have run into the same problem: https://github.com/sgl-project/sglang/issues/3668

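I eventually found the cause here: https://github.com/sgl-project/sglang/issues/3620. The chat_template in DeepSeek R1's tokenizer configuration appends <think>\n to the end of the prompt fed to the model. The model starts generating after that tag, so the returned text does not begin with <think>.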

This is in fact a deliberate choice by the DeepSeek team to preserve model quality, and it is also mentioned on the official DeepSeek-R1 page:

Additionally, we have observed that the DeepSeek-R1 series models tend to bypass thinking pattern (i.e., outputting “<think>\n\n</think>”) when responding to certain queries, which can adversely affect the model’s performance. To ensure that the model engages in thorough reasoning, we recommend enforcing the model to initiate its response with “<think>\n” at the beginning of every output.
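If you keep the default template, one option is to restore the tag on the client side. The following is a minimal sketch of that idea, not part of SGLang or DeepSeek: the helper names `restore_think_tag` and `split_reasoning` are hypothetical, and it assumes the response `content` has the shape shown in the Background section (reasoning text, then `</think>`, then the final answer).

```python
from typing import Tuple

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def restore_think_tag(content: str) -> str:
    """Prepend '<think>' plus a newline when only the closing tag is present."""
    has_close = THINK_CLOSE in content
    has_open = content.lstrip().startswith(THINK_OPEN)
    if has_close and not has_open:
        return THINK_OPEN + "\n" + content
    return content


def split_reasoning(content: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no closing tag exists."""
    restored = restore_think_tag(content)
    if THINK_CLOSE not in restored:
        return "", restored.strip()
    reasoning, _, answer = restored.partition(THINK_CLOSE)
    reasoning = reasoning.lstrip()
    if reasoning.startswith(THINK_OPEN):
        reasoning = reasoning[len(THINK_OPEN):]
    return reasoning.strip(), answer.strip()


if __name__ == "__main__":
    # Hypothetical content string shaped like the API response shown earlier.
    raw = "The user said hello.\n</think>\n\nHello! How can I assist you today?"
    print(restore_think_tag(raw))
    print(split_reasoning(raw))
```

Applied to `choices[0].message.content`, this leaves already well-formed responses untouched and only patches those where the opening tag was consumed by the prompt.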

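So if you do not want to add <think> yourself on the client and would rather have the model emit it, you can remove the <think>\\n appended at the end of the chat_template in the tokenizer configuration file that ships with the model files, and then redeploy the model.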
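The template edit can also be scripted. The sketch below rests on several assumptions: the tokenizer configuration is a JSON file (typically `tokenizer_config.json` in a Hugging Face style checkpoint) with a string-valued `chat_template` key, and the template contains the forced `<think>` followed by an escaped newline. The function names and the example path are hypothetical. Keep in mind that removing the forced tag gives up the behaviour DeepSeek recommends in the passage quoted above, so the model may occasionally skip its reasoning.

```python
import json
from pathlib import Path

# After json.load the template usually holds a backslash followed by "n"
# (an escape interpreted later by the template engine); a real newline is
# handled as well, in case the file was written differently.
FORCED_THINK_MARKERS = ("<think>\\n", "<think>\n")


def strip_forced_think(chat_template: str) -> str:
    """Remove the last forced '<think>' + newline marker from a template string."""
    for marker in FORCED_THINK_MARKERS:
        head, sep, tail = chat_template.rpartition(marker)
        if sep:
            return head + tail
    return chat_template


def patch_tokenizer_config(path: str) -> bool:
    """Patch the file in place (with a .bak copy); return True if it changed."""
    config_path = Path(path)
    original_text = config_path.read_text(encoding="utf-8")
    config = json.loads(original_text)
    template = config.get("chat_template")
    if not isinstance(template, str):
        return False
    patched = strip_forced_think(template)
    if patched == template:
        return False
    # Keep a backup so the original template can be restored.
    backup_path = config_path.with_name(config_path.name + ".bak")
    backup_path.write_text(original_text, encoding="utf-8")
    config["chat_template"] = patched
    config_path.write_text(
        json.dumps(config, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return True


if __name__ == "__main__":
    # Hypothetical path; point it at your own model directory.
    target = Path("/path/to/DeepSeek-R1/tokenizer_config.json")
    if target.exists():
        print("patched" if patch_tokenizer_config(str(target)) else "no change")
```

After patching, restart the SGLang server so the modified template is loaded.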