Hands-on: Deploying QwQ-32B-AWQ with vLLM [Single 4090 GPU]



For background on the QwQ model series, see: https://ezcode.blog.csdn.net/article/details/146080810


Running the server

vllm serve Qwen/QwQ-32B-AWQ --max-model-len 5680 
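
The short --max-model-len in the command above is what makes a single 24 GB card workable: the 4-bit weights already occupy a large share of the memory, and the KV cache has to fit in what remains. Below is a rough back-of-the-envelope estimate of the KV cache needed for one full-length sequence. The architecture figures (64 layers, 8 KV heads, head dimension 128) are assumptions taken from the QwQ-32B model card, and a 16-bit KV cache is assumed; none of this was measured in this post.

```python
# Rough KV-cache estimate for one sequence of --max-model-len tokens.
# The architecture values below are assumptions from the QwQ-32B model card.
NUM_LAYERS = 64
NUM_KV_HEADS = 8         # grouped-query attention: 8 key/value heads
HEAD_DIM = 128
BYTES_PER_VALUE = 2      # assumes an fp16/bf16 KV cache
MAX_MODEL_LEN = 5680     # value passed to `vllm serve` above


def kv_cache_bytes_per_token():
    # Factor 2 covers the separate key and value tensors in every layer.
    return 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE


def kv_cache_gib(num_tokens):
    return num_tokens * kv_cache_bytes_per_token() / 1024**3


print(f"{kv_cache_bytes_per_token() / 1024:.0f} KiB per token")
print(f"{kv_cache_gib(MAX_MODEL_LEN):.2f} GiB for {MAX_MODEL_LEN} tokens")
```

Under these assumptions each token costs 256 KiB, so a 5680-token sequence needs about 1.39 GiB of cache on top of the weights.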

Installing vLLM

If you don't have vLLM installed yet, you can create a fresh environment and install it there:

conda create -n e39 python=3.9

conda activate e39

pip install vllm

Downloading the model

You can also download the model ahead of time:

huggingface-cli download Qwen/QwQ-32B-AWQ

To speed up the download, you can point Hugging Face at a mirror through an environment variable:

export HF_ENDPOINT='https://hf-mirror.com'
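
The same download can be scripted from Python. This is a minimal sketch, assuming the huggingface_hub package is available in the environment; download_model is a hypothetical helper name used only for illustration.

```python
import os

# Set the mirror before huggingface_hub is imported, because the library
# reads the endpoint when it is loaded.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"


def download_model(repo_id="Qwen/QwQ-32B-AWQ"):
    """Download the repository into the local Hugging Face cache.

    Returns the local directory that holds the downloaded files.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

Calling download_model() once before starting the server should leave the weights in the local cache, so that vllm serve does not have to fetch them at startup.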

Sending requests

curl - completions

curl http://10.0.1.24:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Qwen/QwQ-32B-AWQ",
        "prompt": "San Francisco is a",
        "max_tokens": 7,
        "temperature": 0
    }'

{
	"id": "cmpl-cc22cd6e54ab4cbdb5db3e6ba2cb994c",
	"object": "text_completion",
	"created": 1741312227,
	"model": "Qwen/QwQ-32B-AWQ",
	"choices": [{
		"index": 0,
		"text": " city of neighborhoods, each with its",
		"logprobs": null,
		"finish_reason": "length",
		"stop_reason": null,
		"prompt_logprobs": null
	}],
	"usage": {
		"prompt_tokens": 4,
		"total_tokens": 11,
		"completion_tokens": 7,
		"prompt_tokens_details": null
	}
}
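
In the response above, finish_reason is "length", meaning generation stopped because it hit max_tokens and not because the model finished on its own. A small sketch of how a client might check for that, using an abbreviated copy of the JSON shown above (summarize_completion is a hypothetical helper):

```python
import json

# Abbreviated copy of the /v1/completions response shown above.
raw = """{
  "choices": [{
    "index": 0,
    "text": " city of neighborhoods, each with its",
    "finish_reason": "length"
  }],
  "usage": {"prompt_tokens": 4, "total_tokens": 11, "completion_tokens": 7}
}"""


def summarize_completion(body):
    """Pull out the generated text and whether it was cut off by max_tokens."""
    choice = body["choices"][0]
    return {
        "text": choice["text"],
        "truncated": choice["finish_reason"] == "length",
        "completion_tokens": body["usage"]["completion_tokens"],
    }


summary = summarize_completion(json.loads(raw))
print(summary)
```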

curl - chat/completions

curl http://10.0.1.24:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Qwen/QwQ-32B-AWQ",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the world series in 2020?"}
        ]
    }'


{
	"id": "chatcmpl-274dd20da7b6453a84d1d2600da96ab4",
	"object": "chat.completion",
	"created": 1741325805,
	"model": "Qwen/QwQ-32B-AWQ",
	"choices": [{
		"index": 0,
		"message": {
			"role": "assistant",
			"reasoning_content": null,
			"content": "Okay, the user is asking who won the World Series in 2020. Let me think. The World Series is the championship series of Major League Baseball, right? It's the playoff event to determine the champion. \n\nFirst, I need to remember or figure out which teams were involved in 2020. I recall that the 2020 season was affected by the pandemic. I think the season was shortened, but they still played the World Series. \n\nWait, the Los Angeles Dodgers were really strong that year. I think they made it to the World Series. On the other side, maybe the Tampa Bay Rays? They had a good run too. But was that the case?\n\nLet me try to remember the outcome. I think the Dodgers won it. They had a strong lineup, including players like Mookie Betts and Clayton Kershaw. The Rays were underdogs but had a surprising run. \n\nTo be sure, the 2020 World Series was the 116th edition, right? The Dodgers defeated the Tampa Bay Rays in six games. The deciding game was at Dodger Stadium. The key player might have been Walker Buehner with a walk-off home run in Game 6. That's a memorable moment. So the answer is the Los Angeles Dodgers. I should confirm the details to be accurate. The Rays were the American League champions, and the Dodgers the National League. The series was held in Arlington, Texas, due to the pandemic, but the final game was in LA? Wait, no, maybe all games were in Texas that year because of travel restrictions?\n\nHmm, actually, in 2020, because of the pandemic, all the World Series games were played at a single site, Globe Life Park in Arlington, Texas. The Dodgers won their first World Series since 1988. That's a key point. So the answer is definitely the Los Angeles Dodgers. I should also mention the opponent, the Tampa Bay Rays, and maybe a brief note about the context of the pandemic affecting the location.\n</think>\n\nThe **Los Angeles Dodgers** won the 2020 World Series, defeating the **Tampa Bay Rays** in six games. 
The championship was held at Globe Life Field in Arlington, Texas, due to pandemic-related travel restrictions. The Dodgers secured their first World Series title since 1988, with Cody Bellinger and Mookie Betts among the key contributors. The clinching Game 6 ended in dramatic fashion when Corey Seager hit a walk-off single to win the game, and ultimately the series. This marked one of the most memorable World Series runnings in recent history.",
			"tool_calls": []
		},
		"logprobs": null,
		"finish_reason": "stop",
		"stop_reason": null
	}],
	"usage": {
		"prompt_tokens": 33,
		"total_tokens": 584,
		"completion_tokens": 551,
		"prompt_tokens_details": null
	},
	"prompt_logprobs": null
}
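
Note that reasoning_content is null in the response above: the model's thinking and its final answer both arrive in content, separated by a `</think>` tag. A minimal sketch of splitting the two on the client side, assuming the tag appears at most once (split_reasoning is a hypothetical helper, and the sample string is a shortened version of the output above):

```python
def split_reasoning(content, marker="</think>"):
    """Return (reasoning, answer).

    If the marker is missing, the whole text is treated as the answer.
    """
    reasoning, sep, answer = content.partition(marker)
    if not sep:
        return "", content.strip()
    return reasoning.strip(), answer.strip()


sample = (
    "Okay, the user is asking who won the World Series in 2020.\n"
    "</think>\n\n"
    "The **Los Angeles Dodgers** won the 2020 World Series."
)
reasoning, answer = split_reasoning(sample)
print(answer)
```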

Python - completion

from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://10.0.1.24:8000/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
completion = client.completions.create(
    model="Qwen/QwQ-32B-AWQ",
    prompt="San Francisco is a",
    max_tokens=500,
)
print("Completion result:", completion)

Completion(
	id='cmpl-88135d4bbab946d19713beeae22020e9', 
	choices=[CompletionChoice(
		finish_reason='length', index=0, logprobs=None, 
		text=' city in California on the coast of the Pacific Ocean. It has a population of 815,000. The San Francisco Bay is an arm of the Pacific Ocean. San Francisco is on a peninsula between the San Francisco Bay and the Pacific Ocean. The Pacific Ocean is to the west of the city, and the San Francisco Bay is to the east. The Franciscan7 seat of the Roman Catholic Diocese8 of San Francisco is in the city. The headquarters of the United Nations is in San Francisco. The city\'s motto is "Love of the United States". A region with fog along the coast is home to San Francisco.  San Francisco is a West Coast city. San Francisco was founded in 1776. San Francisco became a city in 1850. The 1906 earthquake9 destroyed San Francisco. The golden gate bridge was built in 1937. San Francisco is famous for its steep hills and cable cars. San Francisco is a city of the world. More than 80 countries have consulates10 in San Francisco. The climate in San Francisco is cool and damp. Pine9 forests surround the city. The sun shines in San Francisco only five days a week in spring. San Francisco is a city of learning. The University of California at Berkeley is north of the city. Stanford University is south of the city. San Francisco is a city of homes. It is a great city, but it is also a city of neighborhoods. What makes San Francisco different? It\'s the people. People from many nations and different backgrounds have come to San Francisco. San Francisco is a city of jokes. The San Francisco Chronicle is the oldest newspaper west of the Mississippi River. 在1906年的地震中,旧金山市被破坏。金门大桥建于1937年。旧金山以陡峭的山丘和有缆索的缆车而闻名。 这座城市是斯坦福大学和加州大学伯克利分校所在地。 旧金山是一座著名的国际都市,,拥有来自许多国家和不同背景的人们。这座城市以其独特的社区而自豪。\n\nA computer virus is a program that can copy itself and infect a computer. A computer virus (as defined by Fred Cohen in 1984) is a program that can copy itself and infect a computer when executed. Computer viruses infect other computer programs by modifying them. A virus might infect an application\'s', 
		stop_reason=None, 
		prompt_logprobs=None
	)], 
	created=1741326318, 
	model='Qwen/QwQ-32B-AWQ', 
	object='text_completion', 
	system_fingerprint=None, 
	usage=CompletionUsage(
		completion_tokens=500, 
		prompt_tokens=4, 
		total_tokens=504, 
		completion_tokens_details=None, 
		prompt_tokens_details=None
	)
)

Python - chat completion

from openai import OpenAI
# Set OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://10.0.1.24:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

chat_response = client.chat.completions.create(
    model="Qwen/QwQ-32B-AWQ",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ]
)

print("Chat response:", chat_response)



ChatCompletion(
	id='chatcmpl-8ae086bda227469da4d4657582187438', 
	choices=[Choice(
		finish_reason='stop', index=0, logprobs=None, 
		message=ChatCompletionMessage(
			content='Okay, the user asked for a joke. Let me think of a good one. Maybe something light and not offensive. I should pick a classic setup with a unexpected twist. How about a pun? Those are usually safe.\n\nLet me recall some categories. Animals are always a good start. Maybe a dog or a cat joke. Wait, the classic "Why did the chicken cross the road?" is too common. Need something less obvious. Oh, maybe a computer-related joke since we\'re in tech. Hmm, but maybe too niche. \n\nWhat about a play on words? "I used to be a baker but I couldn\'t make enough dough." That\'s a pun on dough referring to both bread and money. But maybe too simple. Let me check another option. A mathematician joke? "Why was the equal sign so humble? Because it knew it wasn\'t less than or greater than anyone else." That\'s clever but might be too obscure. \n\nAlternatively, a coffee-related joke? "I told my wife she was drawing her eyebrows too high. She looks surprised." Wait, no, that one is about眉毛 and surprise表情. Wait, actually, yes. The joke is a play on the word "surprised" because high eyebrows are associated with surprise. That\'s better. It\'s simple and clear. \n\nAlternatively, "Why don\'t skeletons fight each other? They don\'t have the guts." That\'s another good pun. Maybe that\'s better. Let me decide. The eyebrows one is shorter. Let me go with the eyebrows one. Let me confirm the wording. "I told my wife she was drawing her eyebrows too high. She looked surprised." Yeah, that works. It\'s concise and the pun is clear. Alright, I\'ll go with that.\n</think>\n\nSure! Here\'s a light-hearted one for you:\n\n*"I told my wife she was drawing her eyebrows too high.  \nShe looked surprised."*  \n\nGot it? 😊', 
			refusal=None, 
			role='assistant', 
			audio=None, 
			function_call=None, 
			tool_calls=[], 
			reasoning_content=None
		), 
		stop_reason=None
	)], 
	created=1741312435, 
	model='Qwen/QwQ-32B-AWQ', 
	object='chat.completion', 
	service_tier=None, 
	system_fingerprint=None, 
	usage=CompletionUsage(
		completion_tokens=399, 
		prompt_tokens=26, 
		total_tokens=425, 
		completion_tokens_details=None, 
		prompt_tokens_details=None
	), 
	prompt_logprobs=None
)
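
Because the model thinks before it answers, even a short joke took 399 completion tokens in the run above, so it can be more pleasant to stream the output as it is generated. Below is a sketch using the stream=True option of the OpenAI client; collect_stream and stream_joke are hypothetical helper names, and the server address is the one used throughout this post.

```python
def collect_stream(chunks):
    """Print text deltas as they arrive and return the full text."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
            print(delta, end="", flush=True)
    return "".join(parts)


def stream_joke(base_url="http://10.0.1.24:8000/v1"):
    # Imported here so collect_stream can be used without the openai package.
    from openai import OpenAI

    client = OpenAI(api_key="EMPTY", base_url=base_url)
    stream = client.chat.completions.create(
        model="Qwen/QwQ-32B-AWQ",
        messages=[{"role": "user", "content": "Tell me a joke."}],
        stream=True,
    )
    return collect_stream(stream)
```

Calling stream_joke() against a running server prints the reply piece by piece and returns the full text once the stream ends.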



2025-03-07 (Fri)
