What Are Prompts in LLMs? How to Quickly Implement Prompts with `LangChain` (Part 1)
#

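A prompt is the natural-language input through which a user interacts with a language model: the model parses the text and responds to it, which is how the user and the machine communicate. A prompt is not a programming language, but it plays a comparable role as the instruction the model acts on, and prompts are widely used in natural language processing (NLP) and artificial intelligence (AI).

A prompt template is a predefined recipe for generating prompts for a language model.

A template can include instructions, few-shot examples, and the specific context and question appropriate for a given task.

LangChain provides tooling for creating and working with prompt templates.

LangChain strives to create model-agnostic templates, so that existing templates can be reused easily across different language models.

Typically, an LLM expects the prompt to be either a string or a list of chat messages.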

`PromptTemplate`
#

Use PromptTemplate to create a template for a string prompt.

By default, PromptTemplate uses Python's str.format syntax for templating.

The template supports any number of variables, including none:

from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template(
    "Tell me a {adjective} joke about {content}."
)
prompt_template.format(adjective="funny", content="chickens")

# > 'Tell me a funny joke about chickens.'

# No variables
prompt_template = PromptTemplate.from_template("Tell me a joke")
prompt_template.format()
# > 'Tell me a joke'

PromptTemplate is generally used for single-turn interactions, where no conversation history is needed.
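To make the str.format behaviour concrete, here is a minimal sketch in plain Python, without LangChain. It shows how placeholder names can be discovered in a template and then filled in. The helper name `template_variables` is an assumption made for illustration; it is not a LangChain function.

```python
from string import Formatter


def template_variables(template):
    """Return the placeholder names found in a str.format-style template."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        # field_name is None for plain text segments without a placeholder.
        if field_name and field_name not in names:
            names.append(field_name)
    return names


template = "Tell me a {adjective} joke about {content}."
variables = template_variables(template)
filled = template.format(adjective="funny", content="chickens")

print(variables)  # ['adjective', 'content']
print(filled)     # Tell me a funny joke about chickens.
print(template_variables("Tell me a joke"))  # []
```

A template with no placeholders simply yields an empty variable list, which matches the "no variables" case above.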

`ChatPromptTemplate`
#

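ChatPromptTemplate represents a list of chat messages. Each chat message is associated with content and an additional parameter called role. For example, a chat message can be associated with an AI assistant, a human, or a system role.

Create a chat prompt template like this: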

from langchain_core.prompts import ChatPromptTemplate

chat_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful AI bot. Your name is {name}."),
        ("human", "Hello, how are you doing?"),
        ("ai", "I'm doing well, thanks!"),
        ("human", "{user_input}"),
    ]
)

messages = chat_template.format_messages(name="Bob", user_input="What is your name?")

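ChatPromptTemplate.from_messages accepts a variety of message representations.

For example, in addition to the 2-tuple (type, content) representation used above, we can pass in an instance of MessagePromptTemplate or BaseMessage.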

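from langchain_core.messages import SystemMessage
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate

chat_template = ChatPromptTemplate.from_messages(
    [
        # This plays the same role as the "system" tuple above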
        SystemMessage(
            content=(
                "You are a helpful assistant that re-writes the user's text to "
                "sound more upbeat."
            )
        ),
        HumanMessagePromptTemplate.from_template("{text}"),
    ]
)
messages = chat_template.format_messages(text="I don't like eating tasty things")
print(messages)

This gives us a lot of flexibility in how we construct chat prompts.
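The idea of role-tagged message templates can be sketched in plain Python. The following is only an illustration under simplifying assumptions: `format_chat` and the dict-based message shape are hypothetical and are not part of LangChain.

```python
# A chat "template" modelled as an ordered list of (role, template) pairs.
chat_spec = [
    ("system", "You are a helpful AI bot. Your name is {name}."),
    ("human", "Hello, how are you doing?"),
    ("ai", "I'm doing well, thanks!"),
    ("human", "{user_input}"),
]


def format_chat(spec, **variables):
    """Fill every message template and return a list of role/content dicts."""
    return [
        {"role": role, "content": text.format(**variables)}
        for role, text in spec
    ]


messages = format_chat(chat_spec, name="Bob", user_input="What is your name?")
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Messages without placeholders pass through unchanged, so fixed and templated messages can be mixed in one list.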

The LCEL approach
#

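Both PromptTemplate and ChatPromptTemplate implement the Runnable interface. This means they support the invoke, ainvoke, stream, astream, batch, abatch, and astream_log calls.

PromptTemplate accepts a dictionary (of prompt variables) and returns a StringPromptValue. ChatPromptTemplate accepts a dictionary and returns a ChatPromptValue.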

prompt_val = prompt_template.invoke({"adjective": "funny", "content": "chickens"})
# StringPromptValue(text='Tell me a funny joke about chickens.')
prompt_val.to_string()
# > Tell me a funny joke about chickens.

prompt_val.to_messages()
#> [HumanMessage(content='Tell me a funny joke about chickens.')]

Another example:

chat_val = chat_template.invoke({"text": "i dont like eating tasty things."})
chat_val.to_messages()

#> [SystemMessage(content="You are a helpful assistant that re-writes the user's text to sound more upbeat."),HumanMessage(content='i dont like eating tasty things.')]

# Convert to a string
chat_val.to_string()

#> "System: You are a helpful assistant that re-writes the user's text to sound more upbeat.\nHuman: i dont like eating tasty things."
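The two conversions shown above can be mimicked with a small toy model. The classes `StringValue` and `ChatValue` below are hypothetical stand-ins written for illustration, assuming only the behaviour visible in the outputs above: a string value becomes a single human message, and a chat value is rendered as one "Role: content" line per message.

```python
from dataclasses import dataclass, field


@dataclass
class StringValue:
    """Toy stand-in for a string prompt value."""
    text: str

    def to_string(self):
        return self.text

    def to_messages(self):
        # A plain string is treated as a single human message.
        return [("human", self.text)]


@dataclass
class ChatValue:
    """Toy stand-in for a chat prompt value."""
    messages: list = field(default_factory=list)

    def to_string(self):
        # One "Role: content" line per message.
        return "\n".join(
            f"{role.capitalize()}: {content}" for role, content in self.messages
        )

    def to_messages(self):
        return list(self.messages)


string_value = StringValue("Tell me a funny joke about chickens.")
chat_value = ChatValue([("system", "Be upbeat."), ("human", "hi")])

print(string_value.to_messages())
print(chat_value.to_string())
```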

Composing with typed messages
#

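A chat prompt is made up of a list of messages. Purely as a convenience for developers, LangChain adds a shorthand for building these prompts. In this pipeline, each new element is a new message in the final prompt.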

from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

First, let's initialize the base ChatPromptTemplate with a system message. It doesn't have to start with a system message, but that is usually good practice.

prompt = SystemMessage(content="You are a nice pirate")

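We can then easily build a pipeline that combines it with other messages or message templates.

Use a Message when there are no variables to format, and a MessagePromptTemplate when there are. You can also use just a string (note: it is automatically inferred to be a HumanMessagePromptTemplate).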

new_prompt = (
    prompt + HumanMessage(content="hi") + AIMessage(content="what?") + "{input}"
)

Under the hood, LangChain creates an instance of the ChatPromptTemplate class, so we can use it just as before.

new_prompt.format_messages(input="i said hi")

# Output
[SystemMessage(content='You are a nice pirate', additional_kwargs={}),
 HumanMessage(content='hi', additional_kwargs={}, example=False),
 AIMessage(content='what?', additional_kwargs={}, example=False),
 HumanMessage(content='i said hi', additional_kwargs={}, example=False)]

It can also be used in an LLMChain, just like any other prompt.

from langchain.chains import LLMChain
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()
chain = LLMChain(prompt=new_prompt, llm=llm)
chain.run("I said HI!")
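How `+` can build up a prompt message by message is easy to sketch with operator overloading. The `ChatPrompt` class below is a hypothetical toy, not LangChain's implementation. It assumes messages are (role, template) tuples and that a bare string is treated as a human message template, mirroring the behaviour described above.

```python
class ChatPrompt:
    """Toy chat prompt: an ordered list of (role, template) pairs."""

    def __init__(self, messages):
        self.messages = list(messages)

    def __add__(self, other):
        if isinstance(other, str):
            # A bare string is inferred to be a human message template.
            other = ("human", other)
        if isinstance(other, ChatPrompt):
            return ChatPrompt(self.messages + other.messages)
        return ChatPrompt(self.messages + [other])

    def format_messages(self, **variables):
        # Simplification: every message is passed through str.format.
        return [(role, text.format(**variables)) for role, text in self.messages]


prompt = ChatPrompt([("system", "You are a nice pirate")])
new_prompt = prompt + ("human", "hi") + ("ai", "what?") + "{input}"
formatted = new_prompt.format_messages(input="i said hi")
print(formatted)
```

Each `+` returns a new object rather than mutating the left operand, so the base prompt can be reused to build several variants.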

Example selectors
#

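| Name | Description |
| --- | --- |
| similarity | Uses semantic similarity between the input and the examples to decide which examples to choose. |
| MMR | Uses max marginal relevance between the input and the examples to decide which examples to choose. |
| length_based | Selects examples based on how many can fit within a certain length. |
| Ngram | Uses ngram overlap between the input and the examples to decide which examples to choose. |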

Select by length
#

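The length-based selector chooses which examples to use based on length. This is useful when we are worried that the constructed prompt will exceed the context window. For longer inputs it selects fewer examples to include, and for shorter inputs it selects more.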

from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.prompts.example_selector import LengthBasedExampleSelector

# Examples for a task of producing antonyms.
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)
example_selector = LengthBasedExampleSelector(
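    # The examples available to select from.
    examples=examples,
    # The PromptTemplate used to format the examples.
    example_prompt=example_prompt,
    # The maximum length of the formatted examples. Length is measured by the get_text_length function below.
    max_length=25,
    # The function used to get the length of a string, which determines which examples to include.
    # It is commented out because a default is provided if none is specified.
    # get_text_length: Callable[[str], int] = lambda x: len(re.split("\n| ", x))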
)
dynamic_prompt = FewShotPromptTemplate(
    # We provide an example selector
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)

print(dynamic_prompt.format(adjective="big"))

Give the antonym of every input

Input: happy
Output: sad

Input: tall
Output: short

Input: energetic
Output: lethargic

Input: sunny
Output: gloomy

Input: windy
Output: calm

Input: big
Output:

An example with a long input, so only one example is selected:

long_string = "big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else"
print(dynamic_prompt.format(adjective=long_string))

Give the antonym of every input

Input: happy
Output: sad

Input: big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else
Output:
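The selection logic can be sketched in plain Python. This is a simplified illustration, not LangChain's source. It assumes length is a word count obtained by splitting on spaces and newlines (the default noted in the comment above) and that examples are added in order until the remaining budget would go negative. The function names are hypothetical.

```python
import re

examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]


def format_example(example):
    return f"Input: {example['input']}\nOutput: {example['output']}"


def text_length(text):
    # Word count: split on newlines and spaces.
    return len(re.split("\n| ", text))


def select_by_length(examples, user_input, max_length=25):
    """Take examples in order while they still fit in the remaining budget."""
    remaining = max_length - text_length(user_input)
    selected = []
    for example in examples:
        cost = text_length(format_example(example))
        if remaining - cost < 0:
            break
        selected.append(example)
        remaining -= cost
    return selected


long_string = (
    "big and huge and massive and large and gigantic and tall and "
    "much much much much much bigger than everything else"
)
print(len(select_by_length(examples, "big")))        # 5
print(len(select_by_length(examples, long_string)))  # 1
```

Each formatted example costs 4 words. The short input leaves a budget of 24, enough for all five examples, while the 21-word input leaves a budget of 4, enough for exactly one.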

Select by max marginal relevance (MMR)
#

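MaxMarginalRelevanceExampleSelector selects examples based on a combination of which examples are most similar to the input, while also optimizing for diversity. It does this by finding the examples whose embeddings have the greatest cosine similarity with the input, and then adding them iteratively while penalizing them for their closeness to already selected examples.

Let's look at an example: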
#

from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.prompts.example_selector import (
    MaxMarginalRelevanceExampleSelector,
    SemanticSimilarityExampleSelector,
)
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

# Examples for a pretend task of producing antonyms.
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

example_selector = MaxMarginalRelevanceExampleSelector.from_examples(
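    # The list of examples available to select from.
    examples,
    # The embedding class used to produce embeddings, which are used to measure semantic similarity.
    OpenAIEmbeddings(),
    # The VectorStore class used to store the embeddings and run the similarity search.
    FAISS,
    # The number of examples to produce.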
    k=2,
)
mmr_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)

The input "worried" describes a feeling, so the happy/sad example should be selected first.

print(mmr_prompt.format(adjective="worried"))

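# Let's compare this with what we would get from similarity alone,
# by using SemanticSimilarityExampleSelector instead of MaxMarginalRelevanceExampleSelector.
example_selector = SemanticSimilarityExampleSelector.from_examples(
    # The list of examples available to select from.
    examples,
    # The embedding class used to produce embeddings for the vector similarity search.
    OpenAIEmbeddings(),
    # The VectorStore class used to store the embeddings and run the similarity search.
    FAISS,
    k=2,
)
similar_prompt = FewShotPromptTemplate(
    # We provide an example selector instead of a fixed list of examples.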
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)
print(similar_prompt.format(adjective="worried"))
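To see why MMR and plain similarity can disagree, here is a small self-contained sketch using hand-made 2-D vectors instead of real embeddings. Everything in it is an assumption for illustration: the vectors, the `lambda_mult` weighting, and the function names are hypothetical, and this is a textbook-style greedy MMR rather than LangChain's own code.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, candidates, k=2):
    """Indices of the k candidates most similar to the query."""
    order = sorted(
        range(len(candidates)),
        key=lambda i: cosine(query, candidates[i]),
        reverse=True,
    )
    return order[:k]


def mmr_select(query, candidates, k=2, lambda_mult=0.5):
    """Greedy MMR: reward relevance, penalize closeness to earlier picks."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = (1.0, 0.0)
vectors = [
    (0.90, 0.10),   # very close to the query
    (0.88, 0.12),   # near-duplicate of the first vector
    (0.60, -0.80),  # less similar, but different from the first
]

print(top_k_similar(query, vectors))  # [0, 1]
print(mmr_select(query, vectors))     # [0, 2]
```

Plain similarity returns the two near-duplicates, while MMR skips the second one in favour of a less similar but more diverse candidate.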

Select by ngram overlap
#

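NGramOverlapExampleSelector selects and orders examples by how similar they are to the input, according to an ngram overlap score. The ngram overlap score is a float between 0.0 and 1.0, inclusive.

The selector allows a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. By default the threshold is -1.0, so no examples are excluded; they are only reordered. Setting the threshold to 0.0 excludes examples that have no ngram overlap with the input.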

from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.prompts.example_selector.ngram_overlap import NGramOverlapExampleSelector

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

# Examples for a translation task.
examples = [
    {"input": "See Spot run.", "output": "Ver correr a Spot."},
    {"input": "My dog barks.", "output": "Mi perro ladra."},
    {"input": "Spot can run.", "output": "Spot puede correr."},
]

example_selector = NGramOverlapExampleSelector(
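    # The examples available to select from.
    examples=examples,
    # The PromptTemplate used to format the examples.
    example_prompt=example_prompt,
    # The threshold at which the selector stops. Defaults to -1.0.
    threshold=-1.0,
)
dynamic_prompt = FewShotPromptTemplate(
    # We provide an example selector.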
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the Spanish translation of every input",
    suffix="Input: {sentence}\nOutput:",
    input_variables=["sentence"],
)

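For a negative threshold, the selector sorts the examples by ngram overlap score and excludes none. For a threshold greater than 1.0, the selector excludes all examples and returns an empty list. For a threshold equal to 0.0, the selector sorts the examples by ngram overlap score and excludes those with no ngram overlap with the input.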
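The threshold behaviour can be illustrated with a simplified score. The sketch below is not how NGramOverlapExampleSelector computes its score. As a stated assumption, it uses the fraction of an example's unique words that also appear in the input (unigram overlap), which also lies between 0.0 and 1.0. The function names are hypothetical.

```python
import re

examples = [
    {"input": "See Spot run.", "output": "Ver correr a Spot."},
    {"input": "My dog barks.", "output": "Mi perro ladra."},
    {"input": "Spot can run.", "output": "Spot puede correr."},
]


def words(text):
    return set(re.findall(r"\w+", text.lower()))


def overlap_score(sentence, example_input):
    """Fraction of the example's unique words that occur in the sentence."""
    example_words = words(example_input)
    if not example_words:
        return 0.0
    return len(example_words & words(sentence)) / len(example_words)


def select_by_overlap(examples, sentence, threshold=-1.0):
    """Sort by score (highest first) and drop scores <= threshold."""
    scored = [(overlap_score(sentence, ex["input"]), ex) for ex in examples]
    kept = [(score, ex) for score, ex in scored if score > threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for _, ex in kept]


sentence = "Spot can run fast."
for threshold in (-1.0, 0.0, 1.1):
    selected = select_by_overlap(examples, sentence, threshold)
    print(threshold, [ex["input"] for ex in selected])
```

With a threshold of -1.0 all three examples are kept and reordered, with 0.0 the example sharing no words with the input is dropped, and with a threshold above 1.0 nothing is returned.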

Select by similarity
#

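This selector chooses examples based on their similarity to the input. It does this by finding the examples whose embeddings have the greatest cosine similarity with the input.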

from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

# Examples for a task of producing antonyms.
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

example_selector = SemanticSimilarityExampleSelector.from_examples(
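    # The list of examples available to select from.
    examples,
    # The embedding class used to produce embeddings, which are used to measure semantic similarity.
    OpenAIEmbeddings(),
    # The VectorStore class used to store the embeddings and run the similarity search.
    Chroma,
    k=1,
)
similar_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector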
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)
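As a rough picture of what "greatest cosine similarity" means here, the following sketch replaces the embedding model and vector store with a tiny hand-written lookup table. The 3-D vectors (loosely: emotion, size, weather) are invented for illustration, and `select_most_similar` is a hypothetical helper, not a LangChain API.

```python
import math

examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

# Invented toy "embeddings"; a real setup would call an embedding model.
toy_embeddings = {
    "happy": (1.0, 0.0, 0.1),
    "tall": (0.0, 1.0, 0.0),
    "energetic": (0.7, 0.2, 0.0),
    "sunny": (0.3, 0.0, 1.0),
    "windy": (0.0, 0.1, 1.0),
    "worried": (0.9, 0.0, 0.0),
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_most_similar(examples, word, k=1):
    """Return the k examples whose input vector is closest to the word's."""
    query = toy_embeddings[word]
    ranked = sorted(
        examples,
        key=lambda ex: cosine(query, toy_embeddings[ex["input"]]),
        reverse=True,
    )
    return ranked[:k]


chosen = select_most_similar(examples, "worried", k=1)
print(chosen)  # [{'input': 'happy', 'output': 'sad'}]
```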