LangChain Basics 8: Comparing Ways to Load a Hugging Face Model

This article compares Hugging Face models and Ollama across use cases, installation and deployment, LangChain integration, performance requirements, ease of use, and openness, and shows how the two behave differently in practice.

Overview

Differences between Hugging Face models and Ollama

When loaded through LangChain, Hugging Face models and Ollama differ mainly in their use cases, how they are installed and deployed, and how they integrate with LangChain.

  • Use cases
    Hugging Face models are typically hosted on the Hugging Face Model Hub and cover a wide range of natural language processing tasks, such as text classification, question answering, and text generation.
    Ollama focuses on running large language models (LLMs), such as Llama 2 and Mistral, and provides the ability to run them locally.

  • Installation and deployment
    Hugging Face models can be called directly via the Hugging Face Hub, or installed and run locally. Local deployment requires the transformers library and possibly other dependencies such as sentence_transformers.
    Ollama provides a command-line interface: you interact with models through the ollama command. It uses llama.cpp as its underlying library and adds extra functionality on top.

  • LangChain integration
    Hugging Face models integrate with LangChain through the HuggingFacePipeline class, which lets users run them directly for text generation and other tasks.
    Ollama models are used in LangChain through a dedicated interface or adapter, depending on the API Ollama exposes and on LangChain's integration support.

  • Performance and resource requirements
    The performance and resource requirements of a Hugging Face model depend on the size and complexity of the chosen model, and on whether GPU acceleration is used.
    Because Ollama targets large language models, it may require substantial hardware, especially for 7B, 13B, or larger models.

  • Ease of use and customization
    Hugging Face offers a broad model selection and easy-to-use APIs, well suited to users who need rapid prototyping and model testing.
    Ollama offers more customization options, letting users create custom models and run multiple pretrained models to meet specific needs.

  • Open source and community support
    Hugging Face models are fully open source with an active community; users can easily contribute and share models.
    Ollama is also open source and provides a model library from which users can pick models or upload their own.

Code Examples

Case 1: with a prompt

# Load dependencies
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_core.prompts import PromptTemplate


# Load a local model (Windows path; use a raw string so backslashes are not treated as escapes)
# model_id = r"G:\hugging_fase_model2\gemma-7b"
model_id = r"G:\hugging_fase_model2\gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Increase max_new_tokens to generate longer text
max_new_tokens = 200  # adjust as needed

# Build the pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=max_new_tokens)
hf = HuggingFacePipeline(pipeline=pipe)

# Build the prompt template
template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

# LCEL: compose the prompt and the model into a chain
chain = prompt | hf

# Ask a question
question = "What is electroencephalography?"

# Output
print(chain.invoke({"question": question}))

Returned output:

In this article a number of studies demonstrate the potential for
neurofeedback. Some of the techniques used provide stimulation. One
research in the area of EEG (electroencephalography) studies, called
electroencephalographic stimulation, involves electrical stimulation,
such as ultrasound, lasers, and some sound. Another study, published
in the journal EEG by N. E. Hirschberger, shows the use of ultrasonic
pulses, such as those generated by a camera, as an acoustic
stimulation. The results should prompt us to take some actions.

First, we have to understand a few basic principles regarding the
field. First, electroencephalography is applied in three different
modes. The first uses a sound field. The second uses electrical
stimulation (pulse amplitude) which the user then sends to the
electrodes to activate the auditory stimulus. The third uses a pulse
amplitude, known as field oscillation (fascination) which involves the
electrical stimulation of the scalp of the subject. The final field
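The `chain = prompt | hf` line in case 1 uses LangChain Expression Language (LCEL) composition. As a minimal sketch of what the `|` operator does, here is a plain-Python stand-in; the `Step` class below is illustrative only, not a real LangChain class:

```python
# Hypothetical stand-in for LangChain's Runnable to illustrate `|` composition.
class Step:
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Stand-ins for the prompt template and the LLM
prompt = Step(lambda d: f"Question: {d['question']}\n\nAnswer: Let's think step by step.")
llm = Step(lambda text: text + " [model output]")

chain = prompt | llm
print(chain.invoke({"question": "What is electroencephalography?"}))
```

The real `prompt | hf` works the same way: the template formats the input dict into a string, which is then passed to the Hugging Face pipeline.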

Case 2: no prompt, without invoke

# Load dependencies (only HuggingFacePipeline is needed for this case)
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline


# Instantiate the model (from_model_id loads the tokenizer and model internally)
model_id = r"G:\hugging_fase_model2\gpt2"

model = HuggingFacePipeline.from_model_id(
    model_id=model_id,
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 100},
    device=-1,  # -1 = CPU; use 0 for the first GPU
)

# Print the result (calling the model object directly, without invoke)
print(model('What is electroencephalography?'))

" It’s a very small optical instrument that collects EEG data from a
subject using a handheld electric current.\n\nWhat is
electroencephalography? It’s a very small optical instrument that
collects EEG data from a subject using a handheld electric current.
The image is in real time. We use it just for the visuals and to have
it look real; it’s not like that. It’s just like seeing a person in a
hospital with cerebral palsy. You get this low level of consciousness,
and"

Case 3: executing with run

from langchain_community.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

# Load the model
model_id = r"G:\hugging_fase_model2\gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build the transformers pipeline
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100)

# Wrap it in a LangChain HuggingFacePipeline (use a new name so the
# transformers model above is not shadowed)
llm = HuggingFacePipeline(pipeline=pipe)

# Create the prompt template
template = """Question: {question}

Answer: """
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)

# Run the chain
question = "What is electroencephalography?"
response = llm_chain.run(question)
print(response)

This is an interesting topic which has not been thoroughly discussed
in the literature. In fact, some people have suggested there may not
even be an appropriate definition (see the below). In this document we
would like to discuss whether electroencephalography is a better
alternative, and are therefore going to write up this, in a short
manner.

A. Electroencephalography is a method to measure the electrical
activity or electroconvulsive activity of brain tissue.
Electroencephal
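The three cases differ mainly in how the model is called: `chain.invoke(...)` (case 1), calling the LLM object directly (case 2), and `llm_chain.run(...)` (case 3). A plain-Python sketch with hypothetical stand-in classes (not LangChain's real API) summarizes the three styles side by side:

```python
# Illustrative stand-ins only: FakeLLM and FakeChain mimic the call patterns,
# not LangChain's implementation.
class FakeLLM:
    def __call__(self, prompt):
        # direct call, as in case 2
        return f"echo: {prompt}"

    def invoke(self, prompt):
        return self(prompt)

class FakeChain:
    def __init__(self, template, llm):
        self.template, self.llm = template, llm

    def invoke(self, inputs):
        # case 1 style: dict in, formatted prompt fed to the LLM
        return self.llm(self.template.format(**inputs))

    def run(self, question):
        # case 3 style: positional argument, delegates to invoke
        return self.invoke({"question": question})

llm = FakeLLM()
chain = FakeChain("Question: {question}\n\nAnswer: ", llm)
print(chain.invoke({"question": "What is EEG?"}))  # case 1 style
print(llm("What is EEG?"))                         # case 2 style
print(chain.run("What is EEG?"))                   # case 3 style
```

In real LangChain code the results likewise converge: `run` and a direct call are older conveniences, while `invoke` is the current unified entry point.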

This article compared how the different ways of loading a Hugging Face model in LangChain affect the returned results.
That's the full text; thanks for reading.
