https://python.langchain.com.cn/docs/modules/model_io/prompts/example_selectors/similarity
Semantic Similarity Example Selector
This post is based on LangChain’s official documentation (langchain.com.cn) and explains, in simplified terms, the SemanticSimilarityExampleSelector, a tool that selects few-shot examples by their semantic similarity to the input. The code, examples, and outputs follow the original documentation.
1. What is SemanticSimilarityExampleSelector?
This selector chooses examples based on semantic similarity (meaning-based similarity) to the user’s input.
- It converts both the input and the examples into numerical representations called “embeddings” (using an embedding model such as OpenAIEmbeddings).
- It measures similarity using cosine similarity (a measure of how closely two embedding vectors point in the same direction).
- It retrieves the top k (a specified number of) examples with the highest similarity to the input.
- It uses a VectorStore (e.g., Chroma) to store and search embeddings efficiently.
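The steps in the list above can be sketched in plain Python. This is only an illustration of the idea, not LangChain's implementation: the three-dimensional vectors, the `toy_embeddings` table, and the `top_k` helper are all made up here, whereas a real embedding model returns much longer vectors and a vector store does the search.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings", invented for illustration only.
toy_embeddings = {
    "happy": [0.9, 0.1, 0.0],
    "tall": [0.1, 0.9, 0.0],
    "windy": [0.0, 0.1, 0.9],
}

def top_k(query_vector, k=1):
    """Return the k stored words whose vectors are most similar to the query."""
    ranked = sorted(
        toy_embeddings,
        key=lambda word: cosine_similarity(query_vector, toy_embeddings[word]),
        reverse=True,
    )
    return ranked[:k]

# A made-up query vector that points mostly along the first axis.
query = [0.8, 0.2, 0.1]
print(top_k(query, k=1))  # ['happy']
```

Because the query vector points in almost the same direction as the vector stored for "happy", that entry gets the highest cosine similarity and is returned first.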
2. Step 1: Import Required Modules
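The code below imports the necessary LangChain classes, exactly as in the original documentation. These import paths match the LangChain release that the documentation describes; newer releases may expose the same classes from different packages.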
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
3. Step 2: Prepare Examples
We use the same “creating antonyms” example list from the original text:
# These are a lot of examples of a pretend task of creating antonyms.
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]
4. Step 3: Create example_prompt
This PromptTemplate defines how each example is formatted (matching the original structure):
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)
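To see what this template produces for a single example, the same template string can be filled in with plain Python string formatting. This is a stand-in sketch that does not use LangChain; the variable names `template`, `example`, and `formatted` are chosen here for illustration.

```python
# Plain-Python stand-in for the template string (no LangChain needed).
template = "Input: {input}\nOutput: {output}"

example = {"input": "happy", "output": "sad"}
formatted = template.format(**example)

print(formatted)
# Input: happy
# Output: sad
```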
5. Step 4: Initialize the Semantic Similarity Selector
Configure the selector with examples, embedding model, vector store, and the number of examples to select (k=1). The code is identical to the original:
example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    examples,
    # This is the embedding class used to produce embeddings for similarity measurement.
    OpenAIEmbeddings(),
    # This is the VectorStore class that stores embeddings and enables similarity search.
    Chroma,
    # This is the number of examples to retrieve (top k most similar).
    k=1,
)
When you run this code, the following log from the original documentation may appear. It indicates that Chroma is running locally with an in-memory database, so the stored embeddings are lost when the process exits:
Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.
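Before an example can be embedded, each example dictionary has to become a single piece of text. One plausible way to do that, shown here as an assumption rather than a statement about LangChain's internals, is to join the values of each example in key-sorted order. The helper name `example_to_text` is hypothetical.

```python
def example_to_text(example):
    """Join an example's values, ordered by key name, into one string."""
    return " ".join(example[key] for key in sorted(example))

# Two of the antonym examples from Step 2.
sample_examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
]

# Each string is what would be handed to the embedding model in this sketch.
texts = [example_to_text(ex) for ex in sample_examples]
print(texts)  # ['happy sad', 'tall short']
```

Under this assumption, both the input and the output of an example influence its embedding, so the match against the user's input is made on the whole example and not only on its "input" field.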
6. Step 5: Create a Dynamic Prompt
Combine the selector with a prefix (instruction) and suffix (user input placeholder) using FewShotPromptTemplate:
similar_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector instead of direct examples.
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)
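The way the prefix, the selected examples, and the suffix fit together can be sketched without LangChain. The function `build_few_shot_prompt` below is hypothetical, and the blank-line separator between sections is an assumption of this sketch.

```python
def build_few_shot_prompt(prefix, selected_examples, suffix_template,
                          separator="\n\n", **inputs):
    """Assemble prefix + formatted examples + filled-in suffix into one prompt."""
    example_template = "Input: {input}\nOutput: {output}"
    pieces = [prefix]
    pieces += [example_template.format(**ex) for ex in selected_examples]
    pieces.append(suffix_template.format(**inputs))
    return separator.join(pieces)

prompt = build_few_shot_prompt(
    prefix="Give the antonym of every input",
    # Stands in for whatever the selector would return for this input.
    selected_examples=[{"input": "happy", "output": "sad"}],
    suffix_template="Input: {adjective}\nOutput:",
    adjective="worried",
)
print(prompt)
```

The prompt is "dynamic" in the sense that `selected_examples` is chosen at format time from the user's input instead of being fixed when the template is defined.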
7. Step 6: Test the Selector (3 Scenarios)
We test the selector with different inputs—exactly as in the original documentation, including code and outputs.
Scenario 1: Input is a Feeling ("worried")
The input "worried" (a feeling) is most similar to "happy" (also a feeling). The selector retrieves the happy→sad example.
Code:
# Input is a feeling, so should select the happy/sad example
print(similar_prompt.format(adjective="worried"))
Output (exactly as in the original):
Give the antonym of every input

Input: happy
Output: sad

Input: worried
Output:
Scenario 2: Input is a Measurement ("fat")
The input "fat" (a physical measurement) is most similar to "tall" (also a physical measurement). The selector retrieves the tall→short example.
Code:
# Input is a measurement, so should select the tall/short example
print(similar_prompt.format(adjective="fat"))
Output (exactly as in the original):
Give the antonym of every input

Input: tall
Output: short

Input: fat
Output:
Scenario 3: Add a New Example
You can add new examples to the selector using add_example(). The selector will now include the new example in similarity searches.
Code:
# You can add new examples to the SemanticSimilarityExampleSelector as well
similar_prompt.example_selector.add_example({"input": "enthusiastic", "output": "apathetic"})
# Test with a new feeling-related input ("joyful")
print(similar_prompt.format(adjective="joyful"))
Output (exactly as in the original; with k=1 the selector still retrieves happy→sad as the single most similar example):
Give the antonym of every input

Input: happy
Output: sad

Input: joyful
Output:
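The effect of add_example() is easier to see with a larger k. The sketch below uses a hypothetical in-memory `ToySelector` with hand-written two-dimensional vectors. It is not the LangChain class and the vectors are invented, but it shows a newly added example joining the pool that similarity search ranks.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 2-dimensional vectors, invented for illustration only.
TOY_VECTORS = {
    "happy": [0.9, 0.1],
    "tall": [0.1, 0.9],
    "enthusiastic": [0.7, 0.3],
    "joyful": [0.95, 0.05],
}

def toy_embed(text):
    """Look up the made-up vector for a word."""
    return TOY_VECTORS[text]

class ToySelector:
    """Hypothetical stand-in for a similarity-based example selector."""

    def __init__(self, embed, k=1):
        self.embed = embed   # function mapping text to a vector
        self.k = k           # number of examples to return
        self.entries = []    # list of (vector, example) pairs

    def add_example(self, example):
        # Embed the example's input word and store it next to the example.
        self.entries.append((self.embed(example["input"]), example))

    def select_examples(self, text):
        query = self.embed(text)
        ranked = sorted(self.entries, key=lambda e: cosine(query, e[0]), reverse=True)
        return [example for _, example in ranked[: self.k]]

selector = ToySelector(toy_embed, k=2)
selector.add_example({"input": "happy", "output": "sad"})
selector.add_example({"input": "tall", "output": "short"})
before = [ex["input"] for ex in selector.select_examples("joyful")]

selector.add_example({"input": "enthusiastic", "output": "apathetic"})
after = [ex["input"] for ex in selector.select_examples("joyful")]

print(before)  # ['happy', 'tall']
print(after)   # ['happy', 'enthusiastic']
```

In this toy setup "happy" stays in first place both before and after the addition, which mirrors the k=1 result above, while the new "enthusiastic" example displaces "tall" from second place.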