LangChain 2: Basic Usage

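1. LangChain chat prompt templates (ChatPromptTemplate)

1.1 Creating a chat message template from a list of messages

Use ChatPromptTemplate.from_messages() to create a chat message template.
Each message consists of a role (type) plus its content. Common roles are:
- system: system message
- human: user message
- ai: AI/assistant reply
Message content may contain template variables (such as {name} and {user_input}); the values are supplied later through .format_messages().

Example code: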

from langchain_core.prompts import ChatPromptTemplate

chat_template = ChatPromptTemplate.from_messages([
    ("system", "You are an AI assistant. Your name is {name}."),
    ("human", "Hello, how are you?"),
    ("ai", "I'm doing well, thanks!"),
    ("human", "{user_input}"),
])

messages = chat_template.format_messages(name="Bob", user_input="What is your name?")
print(messages)

The output is the list of formatted messages, with the supplied values substituted for the template variables.

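1.2 Using message classes (SystemMessage, HumanMessagePromptTemplate)

Messages can also be declared directly with message classes such as SystemMessage and HumanMessagePromptTemplate, which makes the types explicit and the structure clearer.
SystemMessage carries fixed content, while HumanMessagePromptTemplate.from_template() supports template variables.

Example code: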

from langchain_core.messages import SystemMessage
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate

chat_template = ChatPromptTemplate.from_messages([
    # Declare the messages with message classes
    SystemMessage(content="You are a helpful assistant that polishes text so it reads more simply and clearly."),
    HumanMessagePromptTemplate.from_template("{text}"),
])

messages = chat_template.format_messages(text="I don't like eating tasty things")
print(messages)

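1.3 A brief introduction to MessagesPlaceholder

Purpose: MessagesPlaceholder reserves a slot in a ChatPromptTemplate so that a list of messages (for example, multi-turn conversation history) can be inserted dynamically at that position.
Typical use: multi-turn conversations, or any case where a variable-length list of messages has to be inserted.

Example code: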

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage

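# Create a chat template that contains a placeholder
prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    MessagesPlaceholder("msgs"),  # "msgs" is the variable name
])

# Invoke the template and insert a list of messages dynamically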
prompt_template.invoke({
    "msgs": [HumanMessage(content="hi!")]
})

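The "msgs" placeholder accepts a list of message objects (HumanMessage, AIMessage, and so on).
This lets the prompt take in conversation history flexibly, which enables personalized, multi-turn chat with context.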
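
To make the splicing idea concrete, here is a minimal plain-Python sketch. It is a toy model, not LangChain's implementation: the Placeholder class and the render function are hypothetical names introduced only for illustration, and messages are simplified to (role, content) tuples.

```python
from dataclasses import dataclass


@dataclass
class Placeholder:
    """Hypothetical stand-in for a message-list slot; not a LangChain class."""
    name: str


def render(template, values):
    """Expand a template into a flat list of (role, content) tuples."""
    messages = []
    for item in template:
        if isinstance(item, Placeholder):
            # Splice in the whole list supplied under the placeholder's name.
            messages.extend(values[item.name])
        else:
            role, text = item
            # Ordinary entries are string templates filled from the same values.
            messages.append((role, text.format(**values)))
    return messages


template = [
    ("system", "You are a helpful assistant named {name}."),
    Placeholder("history"),
    ("human", "{question}"),
]
history = [("human", "hi!"), ("ai", "Hello! How can I help?")]

rendered = render(
    template,
    {"name": "Bob", "history": history, "question": "What did I just say?"},
)
for role, content in rendered:
    print(f"{role}: {content}")
```

The history list can be any length, so the same template serves the first turn of a conversation as well as the twentieth.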

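1.4 Few-shot prompt templates

Purpose: a few-shot prompt template gives the model a handful of examples so that it better understands the question and the expected form of the answer. It suits situations where little data is available, since a small number of examples can guide the model on new inputs.

Example 1: question-answer examples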

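examples = [
    {
        "question": "What is Crab Man?",
        "answer": "Crab Man is a fictional comic-book character."
    },
    {
        "question": "What is torsalplexity?",
        "answer": "Unknown."
    },
    {
        "question": "What is a language model?",
        "answer": "A language model is a machine learning model used to generate natural-language text."
    }
]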

Example 2: building a prompt with FewShotPromptTemplate (this only assembles prompt text; no model training is involved)

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

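# Define the few-shot examples
examples = [
    {
        "question": "Who lived longer, Muhammad Ali or Alan Turing?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
"""
    },
    {
        "question": "When was the founder of craigslist born?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
"""
    },
    {
        "question": "Who was the maternal grandfather of George Washington?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
"""
    }
]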

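# Template used to render a single question-answer example
example_prompt = PromptTemplate(
    input_variables=["question", "answer"],
    template="Question: {question}\n{answer}"
)

# Build the FewShotPromptTemplate
prompt = FewShotPromptTemplate(
    examples=examples,              # example question-answer pairs
    example_prompt=example_prompt,  # how each example is rendered
    suffix="Question: {input}",     # the user's input goes here
    input_variables=["input"]       # name of the user-input variable
)

# Format the template into the complete prompt
print(prompt.format(input="Who was the father of George Washington?"))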

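Summary:
- Few-shot prompt templates supply a small number of examples so the model can better grasp the question and the expected answer structure.
- The template is configurable: the input and output format, how examples are selected, and how each one is formatted.
- They suit situations with little data, helping the model generalize to new inputs.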
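
As a rough illustration of what a few-shot prompt amounts to, the sketch below assembles one with plain string formatting. It is a simplified model under stated assumptions, not the library's code: build_few_shot_prompt is a hypothetical helper, and the blank-line separator is an assumption chosen to resemble FewShotPromptTemplate's default behavior.

```python
def build_few_shot_prompt(examples, example_template, suffix, separator="\n\n", **inputs):
    """Render every example, then append the suffix filled with the user input."""
    rendered = [example_template.format(**example) for example in examples]
    pieces = rendered + [suffix.format(**inputs)]
    return separator.join(pieces)


examples = [
    {
        "question": "What is a language model?",
        "answer": "A machine learning model that generates natural-language text.",
    },
    {"question": "What is torsalplexity?", "answer": "Unknown."},
]

prompt_text = build_few_shot_prompt(
    examples,
    example_template="Question: {question}\n{answer}",
    suffix="Question: {input}",
    input="What is a prompt template?",
)
print(prompt_text)
```

The model sees the worked examples first and the open question last, so it tends to continue in the same question-then-answer shape.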

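1.5 Formatting a single few-shot example

In Python, ** unpacks the key-value pairs of a dictionary into keyword arguments of a function call.
In this example, example_prompt.format(**example) is equivalent to
example_prompt.format(question=..., answer=...).
This is convenient for rendering templates in batches: there is no need to pass each variable by hand, as long as the dictionary keys match the template variable names one to one.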

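Example:

from langchain_core.prompts import PromptTemplate

example_prompt = PromptTemplate(
    input_variables=["question", "answer"],
    template="Question: {question}\n{answer}"
)

# A single example: its keys match the template variables
example = {"question": "Who was the father of George Washington?", "answer": "Augustine Washington"}
print(example_prompt.format(**example))

Output:

Question: Who was the father of George Washington?
Augustine Washington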
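
The same unpacking works with Python's built-in str.format, which makes the key-matching rule easy to try without any library. The sketch below uses made-up toy examples and shows batch rendering plus what happens when a dictionary lacks a template variable. It demonstrates plain str.format only; PromptTemplate may validate its inputs differently.

```python
template = "Question: {question}\n{answer}"

examples = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Paris"},
]

# Batch rendering: each dict is unpacked into keyword arguments.
rendered = [template.format(**example) for example in examples]
print("\n\n".join(rendered))

# A dict that lacks a template variable fails with KeyError.
try:
    template.format(**{"question": "Missing answer?"})
    missing_key = None
except KeyError as err:
    missing_key = err.args[0]

print("missing variable:", missing_key)
```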

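1.6 Using an example selector (ExampleSelector)

An ExampleSelector dynamically picks the most suitable few-shot examples instead of inserting every example into the prompt each time.
Typical use: in a few-shot prompt, automatically choose the example question-answer pairs most similar to the input, which can improve the accuracy and relevance of the LLM's answer.

Installation (the code below also imports langchain_openai):
pip install langchain-community langchain-openai chromadb

Main idea:
- Convert the text of every example into a vector with an embedding model (such as OpenAI's text-embedding-ada-002).
- At query time, convert the user input into a vector as well and compute its similarity (for example, cosine similarity) to each example in vector space.
- Return only the k most similar examples and splice them into the prompt.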
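
The steps above can be sketched in plain Python. This is an illustrative sketch only: the three-dimensional vectors are made-up stand-ins for real embeddings, and cosine_similarity and select_top_k are hypothetical helper names, not part of LangChain or Chroma.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_top_k(query_vec, example_vecs, k=1):
    """Return the indices of the k examples most similar to the query."""
    ranked = sorted(
        range(len(example_vecs)),
        key=lambda i: cosine_similarity(query_vec, example_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings" (assumed values); a real embedding model returns much longer vectors.
example_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vec = [0.9, 0.1, 0.0]

top = select_top_k(query_vec, example_vecs, k=2)
print(top)
```

A vector store such as Chroma performs the same kind of nearest-neighbor lookup, with indexing so that it scales beyond a handful of examples.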

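Implementation:

from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Define the examples
examples = [
    {"question": "...", "answer": "..."},
    # more examples...
]

# Build the ExampleSelector
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples,                 # list of candidate examples
    OpenAIEmbeddings(),       # embedding model (defaults to ada-002; needs OPENAI_API_KEY)
    Chroma,                   # vector store used for storage and similarity search
    k=1                       # select the single most similar example
)

# At query time, pick the example most similar to the input
question = "Who was the father of George Washington?"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input: {question}")

for example in selected_examples:
    print("Q:", example["question"])
    print("A:", example["answer"].strip())  # .strip() removes the surrounding blank lines
    print("=" * 20)

Output (with the examples from section 1.4):

Examples most similar to the input: Who was the father of George Washington?
Q: Who was the maternal grandfather of George Washington?
A: Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
====================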

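1.7 Passing the example selector to FewShotPromptTemplate

Core flow: pass an example_selector (such as the SemanticSimilarityExampleSelector built above) to FewShotPromptTemplate in place of the examples list, so that it automatically fills the prompt with the examples most similar to the input.

Example code:

prompt = FewShotPromptTemplate(
    example_selector=example_selector,     # picks the examples automatically
    example_prompt=example_prompt,         # template for a single example
    suffix="Question: {input}",            # the user's actual input
    input_variables=["input"]              # name of the input variable
)
# Renders the prompt with the QA pair most similar to the input
print(prompt.format(input="Who was the father of George Washington?"))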