Rasa Core Policy Components Explained

思念可是反

Category: rasa framework. Tags: nlp, artificial intelligence

Article link: https://blog.csdn.net/m0_54929869/article/details/123643145

RASA CORE Policy (policy components)

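The Rasa NLU module extracts the intent, slots, and other information from each user message, and the Rasa DST (dialogue state tracking) module tracks the conversation and records the user's message history. From this information, a policy predicts which action the bot should take next. That action can be a simple reply, a fallback, and so on. By default, Rasa can predict up to 10 follow-up actions for each user message. To change this value, set the environment variable MAX_NUMBER_OF_PREDICTIONS to the desired maximum number of predictions.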

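Rasa ships with several policies, and more than one can be enabled at the same time; the Rasa Agent coordinates them. In every dialogue turn, each policy predicts the next action together with a confidence, and the agent picks the action with the highest confidence. When several predictions have the same confidence, Rasa falls back to the policy priorities (a higher number wins):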

6 - RulePolicy
3 - MemoizationPolicy or AugmentedMemoizationPolicy
1 - TEDPolicy

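Configuring two or more policies with the same priority is not recommended, for example using MemoizationPolicy and AugmentedMemoizationPolicy together, because when both predict with the same confidence the result is chosen at random.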

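For a custom policy, use the priority parameter in the policy's configuration to set its priority. If your policy is a machine learning policy, it should most likely have priority 1, the same as TEDPolicy.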
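The selection rule described above can be sketched in a few lines of Python. This is a simplified illustration, not Rasa's actual ensemble code: the `Prediction` class and the `pick_action` function are hypothetical names, and the confidences are made up. Only the priority values come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    policy: str        # name of the policy that made the prediction
    action: str        # predicted next action
    confidence: float  # confidence in [0, 1]
    priority: int      # policy priority; higher wins on ties


def pick_action(predictions):
    """Return the prediction with the highest confidence.

    Ties on confidence are broken by the policy priority.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return max(predictions, key=lambda p: (p.confidence, p.priority))


predictions = [
    Prediction("TEDPolicy", "utter_greet", 1.0, priority=1),
    Prediction("MemoizationPolicy", "utter_goodbye", 1.0, priority=3),
    Prediction("RulePolicy", "action_listen", 0.3, priority=6),
]
winner = pick_action(predictions)
print(winner.policy, winner.action)
```

Here TEDPolicy and MemoizationPolicy tie on confidence, so the higher priority decides. RulePolicy has the highest priority but loses because its confidence is lower.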

Machine learning policies

TED Policy

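The Transformer Embedding Dialogue (TED) policy is a multi-task architecture for next-action prediction and entity recognition. The architecture consists of several transformer encoders that are shared by both tasks. A sequence of entity labels is predicted by a conditional random field (CRF) tagging layer on top of the user sequence transformer encoder output that corresponds to the input sequence of tokens. For next-action prediction, the dialogue transformer encoder output and the system action labels are embedded into a single semantic vector space. A dot-product loss is used to maximize the similarity with the target label and minimize the similarity with negative samples.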

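The policy uses a predefined architecture that consists of the following steps:

For each turn, combine the user input (user intent and entities), the previous system action, the slots, and the active forms into an input vector, using concatenation, reshaping, and similar operations.
Feed the input vector into the transformer.
Apply a dense layer to the transformer output to get the dialogue embedding for each turn.
Apply a dense layer to create an embedding for the system action of each turn.
Compute the similarity between the dialogue embedding and the system action embeddings.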
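The last step can be illustrated with a small, dependency-free sketch. It assumes the dialogue embedding and the action embeddings have already been computed (the vectors below are made-up numbers), scores each action with a dot product, and turns the similarities into confidences with a softmax. The function names are hypothetical and this is not the TEDPolicy implementation itself.

```python
import math


def dot(u, v):
    """Dot-product similarity between two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Normalize scores so that they are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def rank_actions(dialogue_embedding, action_embeddings):
    """Return (action, confidence) pairs, most confident first."""
    names = list(action_embeddings)
    similarities = [dot(dialogue_embedding, action_embeddings[n]) for n in names]
    confidences = softmax(similarities)
    return sorted(zip(names, confidences), key=lambda pair: pair[1], reverse=True)


# Made-up 3-dimensional embeddings, for illustration only.
dialogue_embedding = [0.9, 0.1, 0.0]
action_embeddings = {
    "utter_greet": [1.0, 0.0, 0.0],
    "utter_goodbye": [0.0, 1.0, 0.0],
    "action_listen": [0.0, 0.0, 1.0],
}
ranking = rank_actions(dialogue_embedding, action_embeddings)
print(ranking[0][0])
```

During training, the loss pushes the dot product for the correct action up and the dot products for sampled negative actions down; the sketch only covers the inference side.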

For the principles behind TED, see:

Dialogue Transformers: a translation of the Rasa TED policy paper

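During model training, a balanced batching strategy is used to mitigate class imbalance, because some system actions occur far more often than others.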

As with other components, using TED means configuring the policy name and its parameters, for example:

recipe: default.v1
language:  # your language
pipeline:
  # - <pipeline components>

policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 200
  - name: RulePolicy

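epochs: This parameter sets the number of times the algorithm sees the training data (default: 1). One epoch equals one forward pass and one backward pass over all training examples. Sometimes the model needs more epochs to learn properly, and sometimes more epochs do not change performance. The fewer the epochs, the faster the model trains.
max_history: This parameter controls how much dialogue history the model looks at to decide which action to take next. The default max_history for this policy is None, which means the complete dialogue history since the session (re)start is taken into account. To limit the model to a certain number of previous dialogue turns, set max_history to a finite value. Choose max_history carefully, so that the model has enough previous dialogue turns to make a correct prediction. See Featurizers for more details.
number_of_transformer_layers: This parameter sets the number of sequence transformer encoder layers used in the sequence transformer encoders for user, action, and action label texts, and in the dialogue transformer encoder (default: text: 1, action_text: 1, label_action_text: 1, dialogue: 1). The number of layers corresponds to the number of transformer blocks used for the model.
transformer_size: This parameter sets the number of units in the sequence transformer encoder layers used in the sequence transformer encoders for user, action, and action label texts, and in the dialogue transformer encoder (default: text: 128, action_text: 128, label_action_text: 128, dialogue: 128). The vectors coming out of the transformer encoders have the given transformer_size.
connection_density: This parameter defines the fraction of kernel weights that are set to non-zero values for all feed-forward layers in the model (default: 0.2). The value should be between 0 and 1. If you set connection_density to 1, no kernel weights are set to 0 and the layer acts as a standard feed-forward layer. Do not set connection_density to 0, because then all kernel weights are 0 and the model cannot learn.
split_entities_by_comma: This parameter defines whether adjacent entities separated by a comma are treated as one entity or split. For example, an entity of type ingredients such as "apple,banana" can be split into "apple" and "banana". An entity of type address such as "Schönhauser Allee 175, 10119 Berlin" should be treated as one entity.
constrain_similarities: When set to True, this parameter applies a sigmoid cross-entropy loss over all similarity terms. This helps keep the similarities between the input and the negative labels at small values, which should help the model generalize better to real-world test sets.
model_confidence: This parameter lets you configure how confidences are computed during inference. The text this article is based on describes two values:
- softmax: confidences are in the range [0, 1] (the old behavior and the current default). The computed similarities are normalized with the softmax activation function.
- linear_norm: confidences are in the range [0, 1]. The computed dot-product similarities are normalized with a linear function.
That text recommends trying linear_norm as the value for model_confidence, because it should make actions predicted with low confidence easier to handle. Note that the parameter table below lists softmax as the only supported value, so linear_norm may not be available in your Rasa version.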
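To make connection_density more concrete, here is a minimal sketch of a kernel in which only a given fraction of the weights is non-zero. It is an illustration under simplifying assumptions, not the layer Rasa uses: `sparse_kernel` is a hypothetical helper, the kept positions are chosen uniformly at random, and the non-zero values are placeholders rather than trained weights.

```python
import random


def sparse_kernel(rows, cols, connection_density, seed=0):
    """Build a rows x cols kernel where only a fraction of weights is non-zero."""
    if not 0.0 < connection_density <= 1.0:
        raise ValueError("connection_density must be in (0, 1]")
    rng = random.Random(seed)
    total = rows * cols
    # Number of weights that stay non-zero.
    keep = max(1, round(connection_density * total))
    kept_positions = set(rng.sample(range(total), keep))
    return [
        [
            # Placeholder non-zero weight for kept positions, 0.0 otherwise.
            rng.uniform(0.1, 1.0) if r * cols + c in kept_positions else 0.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


kernel = sparse_kernel(5, 10, connection_density=0.2)
non_zero = sum(1 for row in kernel for w in row if w != 0.0)
print(non_zero, "of", 5 * 10, "weights are non-zero")
```

With connection_density set to 1.0 every position is kept, which corresponds to the standard feed-forward layer mentioned above.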

More configurable parameters:

+---------------------------------------+------------------------+--------------------------------------------------------------+
| Parameter                             | Default Value          | Description                                                  |
+=======================================+========================+==============================================================+
| hidden_layers_sizes                   | text: []               | Hidden layer sizes for layers before the embedding layers    |
|                                       | action_text: []        | for user messages and bot messages in previous actions       |
|                                       | label_action_text: []  | and labels. The number of hidden layers is                   |
|                                       |                        | equal to the length of the corresponding list.               |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| dense_dimension                       | text: 128              | Dense dimension for sparse features to use after they are    |
|                                       | action_text: 128       | converted into dense features.                               |
|                                       | label_action_text: 128 |                                                              |
|                                       | intent: 20             |                                                              |
|                                       | action_name: 20        |                                                              |
|                                       | label_action_name: 20  |                                                              |
|                                       | entities: 20           |                                                              |
|                                       | slots: 20              |                                                              |
|                                       | active_loop: 20        |                                                              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| concat_dimension                      | text: 128              | Common dimension to which sequence and sentence features of  |
|                                       | action_text: 128       | different dimensions get converted before concatenation.     |
|                                       | label_action_text: 128 |                                                              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| encoding_dimension                    | 50                     | Dimension size of embedding vectors                          |
|                                       |                        | before the dialogue transformer encoder.                     |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| transformer_size                      | text: 128              | Number of units in user text sequence transformer encoder.   |
|                                       | action_text: 128       | Number of units in bot text sequence transformer encoder.    |
|                                       | label_action_text: 128 | Number of units in bot text sequence transformer encoder.    |
|                                       | dialogue: 128          | Number of units in dialogue transformer encoder.             |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| number_of_transformer_layers          | text: 1                | Number of layers in user text sequence transformer encoder.  |
|                                       | action_text: 1         | Number of layers in bot text sequence transformer encoder.   |
|                                       | label_action_text: 1   | Number of layers in bot text sequence transformer encoder.   |
|                                       | dialogue: 1            | Number of layers in dialogue transformer encoder.            |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| number_of_attention_heads             | 4                      | Number of self-attention heads in transformers.              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| unidirectional_encoder                | True                   | Use a unidirectional or bidirectional encoder                |
|                                       |                        | for `text`, `action_text`, and `label_action_text`.          |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_key_relative_attention            | False                  | If 'True' use key relative embeddings in attention.          |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_value_relative_attention          | False                  | If 'True' use value relative embeddings in attention.        |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| max_relative_position                 | None                   | Maximum position for relative embeddings.                    |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| batch_size                            | [64, 256]              | Initial and final value for batch sizes.                     |
|                                       |                        | Batch size will be linearly increased for each epoch.        |
|                                       |                        | If constant `batch_size` is required, pass an int, e.g. `8`. |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| batch_strategy                        | "balanced"             | Strategy used when creating batches.                         |
|                                       |                        | Can be either 'sequence' or 'balanced'.                      |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| epochs                                | 1                      | Number of epochs to train.                                   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| random_seed                           | None                   | Set random seed to any 'int' to get reproducible results.    |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| learning_rate                         | 0.001                  | Initial learning rate for the optimizer.                     |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| embedding_dimension                   | 20                     | Dimension size of dialogue & system action embedding vectors.|
+---------------------------------------+------------------------+--------------------------------------------------------------+
| number_of_negative_examples           | 20                     | The number of incorrect labels. The algorithm will minimize  |
|                                       |                        | their similarity to the user input during training.          |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| similarity_type                       | "auto"                 | Type of similarity measure to use, either 'auto' or 'cosine' |
|                                       |                        | or 'inner'.                                                  |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| loss_type                             | "cross_entropy"        | The type of the loss function, either 'cross_entropy'        |
|                                       |                        | or 'margin'.                                                 |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| ranking_length                        | 0                      | Number of top actions to include in prediction. Confidences  |
|                                       |                        | of all other actions will be set to 0. Set to 0 to let the   |
|                                       |                        | prediction include confidences for all actions.              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| renormalize_confidences               | False                  | Normalize the top predictions. Applicable only with loss     |
|                                       |                        | type 'cross_entropy' and 'softmax' confidences.              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| maximum_positive_similarity           | 0.8                    | Indicates how similar the algorithm should try to make       |
|                                       |                        | embedding vectors for correct labels.                        |
|                                       |                        | Should be 0.0 < ... < 1.0 for 'cosine' similarity type.      |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| maximum_negative_similarity           | -0.2                   | Maximum negative similarity for incorrect labels.            |
|                                       |                        | Should be -1.0 < ... < 1.0 for 'cosine' similarity type.     |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_maximum_negative_similarity       | True                   | If 'True' the algorithm only minimizes maximum similarity    |
|                                       |                        | over incorrect intent labels, used only if 'loss_type' is    |
|                                       |                        | set to 'margin'.                                             |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| scale_loss                            | True                   | Scale loss inverse proportionally to confidence of correct   |
|                                       |                        | prediction.                                                  |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| regularization_constant               | 0.001                  | The scale of regularization.                                 |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| negative_margin_scale                 | 0.8                    | The scale of how important it is to minimize the maximum     |
|                                       |                        | similarity between embeddings of different labels.           |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| drop_rate_dialogue                    | 0.1                    | Dropout rate for embedding layers of dialogue features.      |
|                                       |                        | Value should be between 0 and 1.                             |
|                                       |                        | The higher the value the higher the regularization effect.   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| drop_rate_label                       | 0.0                    | Dropout rate for embedding layers of label features.         |
|                                       |                        | Value should be between 0 and 1.                             |
|                                       |                        | The higher the value the higher the regularization effect.   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| drop_rate_attention                   | 0.0                    | Dropout rate for attention. Value should be between 0 and 1. |
|                                       |                        | The higher the value the higher the regularization effect.   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| connection_density                    | 0.2                    | Connection density of the weights in dense layers.           |
|                                       |                        | Value should be between 0 and 1.                             |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_sparse_input_dropout              | True                   | If 'True' apply dropout to sparse input tensors.             |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_dense_input_dropout               | True                   | If 'True' apply dropout to sparse features after they are    |
|                                       |                        | converted into dense features.                               |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| evaluate_every_number_of_epochs       | 20                     | How often to calculate validation accuracy.                  |
|                                       |                        | Set to '-1' to evaluate just once at the end of training.    |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| evaluate_on_number_of_examples        | 0                      | How many examples to use for hold out validation set.        |
|                                       |                        | Large values may hurt performance, e.g. model accuracy.      |
|                                       |                        | Keep at 0 if your data set contains a lot of unique examples |
|                                       |                        | of dialogue turns.                                           |
|                                       |                        | Set to 0 for no validation.                                  |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| tensorboard_log_directory             | None                   | If you want to use tensorboard to visualize training         |
|                                       |                        | metrics, set this option to a valid output directory. You    |
|                                       |                        | can view the training metrics after training in tensorboard  |
|                                       |                        | via 'tensorboard --logdir <path-to-given-directory>'.        |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| tensorboard_log_level                 | "epoch"                | Define when training metrics for tensorboard should be       |
|                                       |                        | logged. Either after every epoch ('epoch') or for every      |
|                                       |                        | training step ('batch').                                     |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| checkpoint_model                      | False                  | Save the best performing model during training. Models are   |
|                                       |                        | stored to the location specified by `--out`. Only the one    |
|                                       |                        | best model will be saved.                                    |
|                                       |                        | Requires `evaluate_on_number_of_examples > 0` and            |
|                                       |                        | `evaluate_every_number_of_epochs > 0`                        |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| e2e_confidence_threshold              | 0.5                    | The threshold that ensures that end-to-end is picked only if |
|                                       |                        | the policy is confident enough.                              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| featurizers                           | []                     | List of featurizer names (alias names). Only features        |
|                                       |                        | coming from the listed names are used. If list is empty      |
|                                       |                        | all available features are used.                             |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| entity_recognition                    | True                   | If 'True' entity recognition is trained and entities are     |
|                                       |                        | extracted.                                                   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| constrain_similarities                | False                  | If `True`, applies sigmoid on all similarity terms and adds  |
|                                       |                        | it to the loss function to ensure that similarity values are |
|                                       |                        | approximately bounded.                                       |
|                                       |                        | Used only when `loss_type=cross_entropy`.                    |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| model_confidence                      | "softmax"              | Affects how model's confidence for each action               |
|                                       |                        | is computed. Currently, only one value is supported:         |
|                                       |                        | 1. `softmax` - Similarities between input and action         |
|                                       |                        | embeddings are post-processed with a softmax function,       |
|                                       |                        | as a result of which confidence for all labels sum up to 1.  |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| BILOU_flag                            | True                   | If 'True', additional BILOU tags are added to entity labels. |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| split_entities_by_comma               | True                   | Splits a list of extracted entities by comma to treat each   |
|                                       |                        | one of them as a single entity. Can either be `True`/`False` |
|                                       |                        | globally, or set per entity type, such as:                   |
|                                       |                        | ```                                                          |
|                                       |                        | - name: TEDPolicy                                            |
|                                       |                        |   split_entities_by_comma:                                   |
|                                       |                        |     address: True                                            |
|                                       |                        | ```                                                          |
+---------------------------------------+------------------------+--------------------------------------------------------------+
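The batch_size row above says that the batch size grows linearly from the initial to the final value over the epochs. A minimal sketch of such a schedule follows; `batch_size_for_epoch` is a hypothetical helper written for this article, and the exact rounding Rasa applies may differ.

```python
def batch_size_for_epoch(epoch, batch_size, epochs):
    """Return the batch size for a zero-based epoch index.

    batch_size is either a constant int or an [initial, final] pair.
    """
    if isinstance(batch_size, int):
        return batch_size
    first, last = batch_size
    if epochs <= 1:
        return first
    # Linear interpolation between the initial and the final value.
    return int(first + epoch * (last - first) / (epochs - 1))


schedule = [batch_size_for_epoch(e, [64, 256], epochs=5) for e in range(5)]
print(schedule)  # [64, 112, 160, 208, 256]
```

Passing a plain integer such as 8 returns that value for every epoch, which matches the constant batch_size case in the table.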

UnexpecTED Intent Policy (unexpected intent policy)

New in 2.8

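This feature is experimental. Experimental features are introduced to get feedback from the community, so you are encouraged to try it out. However, the feature may be changed or removed in the future.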

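UnexpecTEDIntentPolicy helps you review conversations and lets your bot react to unlikely user turns. It is an auxiliary policy that should only be used together with at least one other policy, because the only action it can trigger is the special action_unlikely_intent action.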

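UnexpecTEDIntentPolicy has the same model architecture as TEDPolicy. The difference is at the task level. Instead of learning the best action to trigger next, UnexpecTEDIntentPolicy learns from the training stories the set of intents the user is most likely to express in a given conversation context. At inference time it uses this knowledge to check whether the intent predicted by NLU is among the most likely intents. If the predicted intent is indeed likely in the given conversation context, UnexpecTEDIntentPolicy does not trigger any action. Otherwise, it triggers action_unlikely_intent with a confidence of 1.00.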

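UnexpecTEDIntentPolicy should be seen as an aid to TEDPolicy. TEDPolicy is expected to improve as the training data covers more of the unique conversation paths the assistant has to handle, and UnexpecTEDIntentPolicy helps surface those unique conversation paths from past conversations. For example, suppose your training data contains the following story: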

stories:
- story: book_restaurant_table
  steps:
  - intent: request_restaurant
  - action: restaurant_form
  - active_loop: restaurant_form
  - action: restaurant_form
  - active_loop: null
  - slot_was_set:
    - requested_slot: null

An actual conversation, however, may contain an interjection inside the form that you did not account for:

stories:
- story: actual_conversation
  steps:
  - user: |
        I'm looking for a restaurant.
    intent: request_restaurant
  - action: restaurant_form
  - active_loop: restaurant_form
  - slot_was_set:
    - requested_slot: cuisine
  - user: |
        Does it matter? I want to be quick.
    intent: deny

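Once the deny intent is triggered, the policy handling the form keeps requesting the cuisine slot, because the training stories do not say that this situation should be treated differently. To help you notice that a story handling the user's deny intent may be missing at this point, UnexpecTEDIntentPolicy can trigger the action_unlikely_intent action right after the deny intent. You can then improve your assistant by adding a new training story that handles this particular case.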

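To reduce false warnings, UnexpecTEDIntentPolicy has two mechanisms at inference time:

The priority of UnexpecTEDIntentPolicy is intentionally kept lower than that of all rule-based policies, because rules may exist for situations that are novel to TEDPolicy or UnexpecTEDIntentPolicy.
UnexpecTEDIntentPolicy does not predict action_unlikely_intent if the last predicted intent does not appear in any training story, which can happen when an intent is only used in rules.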

How UnexpecTEDIntentPolicy predicts action_unlikely_intent

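UnexpecTEDIntentPolicy is invoked immediately after a user utterance and can either trigger action_unlikely_intent or abstain (in which case the other policies predict the action). To decide whether action_unlikely_intent should be triggered, UnexpecTEDIntentPolicy computes a score for the user's intent in the current conversation context and checks whether this score is below a certain threshold score.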

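This threshold score is computed by collecting the ML model's output on many "negative examples". A negative example is a combination of a conversation context and a user intent that is incorrect for that context. UnexpecTEDIntentPolicy generates these negative examples from your training data by picking a random partial story and pairing it with a random intent that does not occur at that point. For example, suppose you have just one training story: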

version: "2.0"
stories:
- story: happy path 1
  steps:
  - intent: greet
  - action: utter_greet
  - intent: mood_great
  - action: utter_goodbye

and an intent affirm, then a valid negative example would be:

version: "2.0"
stories:
- story: negative example with affirm unexpected
  steps:
  - intent: greet
  - action: utter_greet
  - intent: affirm

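Here, the affirm intent is unexpected because it does not occur in this particular conversation context in any training story. For each intent, UnexpecTEDIntentPolicy uses these negative examples to calculate the range of scores the model predicts. The threshold score is picked from this range so that a certain percentage of the negative examples have predicted scores above the threshold, which means action_unlikely_intent is not triggered for them. This percentage can be controlled with the tolerance parameter. The higher the tolerance, the lower the score of an intent (the more unlikely the intent) has to be before UnexpecTEDIntentPolicy triggers the action_unlikely_intent action.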
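The construction of negative examples can be sketched as follows. This is a simplified illustration: the text above says the policy samples random partial stories and random intents, while the hypothetical `negative_examples` helper below enumerates every possible pair so that the result is deterministic. Stories are represented as plain lists of (step type, name) tuples, which is an assumption made for this sketch.

```python
def negative_examples(stories, intents):
    """Pair each context that precedes a user intent with every intent
    that never follows that context in any training story."""
    # Collect, per context, the intents that actually follow it in training.
    observed_after = {}
    for steps in stories:
        for i, (kind, name) in enumerate(steps):
            if kind == "intent":
                observed_after.setdefault(tuple(steps[:i]), set()).add(name)

    negatives = []
    for context, observed in observed_after.items():
        for intent in sorted(set(intents) - observed):
            negatives.append((list(context), intent))
    return negatives


happy_path = [
    ("intent", "greet"),
    ("action", "utter_greet"),
    ("intent", "mood_great"),
    ("action", "utter_goodbye"),
]
examples = negative_examples([happy_path], ["greet", "mood_great", "affirm"])
for context, intent in examples:
    print(context, "->", intent)
```

One of the generated pairs is the context greet, utter_greet followed by affirm, which is the negative example shown in the story above.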

Configuration:

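You can pass configuration parameters to UnexpecTEDIntentPolicy in the config.yml file. If you want to fine-tune the model's performance, start by modifying the following parameters: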

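epochs: This parameter sets the number of times the algorithm sees the training data (default: 1). One epoch equals one forward pass and one backward pass over all training examples. Sometimes the model needs more epochs to learn properly, and sometimes more epochs do not change performance. The fewer the epochs, the faster the model trains. Here is what the config looks like: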
```
policies:
- name: UnexpecTEDIntentPolicy
  epochs: 200
```
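max_history: This parameter controls how much dialogue history the model looks at before making an inference. The default max_history for this policy is None, which means the complete dialogue history since the session (re)start is taken into account. To limit the model to a certain number of previous dialogue turns, set max_history to a finite value. Choose max_history carefully, so that the model has enough previous dialogue turns to make a correct prediction. Depending on your dataset, higher values of max_history can lead to more frequent predictions of action_unlikely_intent, because the number of unique possible conversation paths grows as more dialogue context is taken into account. Similarly, lowering max_history can lead to action_unlikely_intent being triggered less often, but a trigger can then be a stronger indicator that the corresponding conversation path is highly unique and therefore unexpected. It is recommended to set the max_history of UnexpecTEDIntentPolicy equal to that of TEDPolicy. Here is what the config looks like: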

policies:
- name: UnexpecTEDIntentPolicy
  max_history: 8

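ignore_intents_list: This parameter lets you configure UnexpecTEDIntentPolicy not to predict action_unlikely_intent for a subset of intents. You may want to do this if you find a list of intents for which too many false warnings are generated.
tolerance: The tolerance parameter is a number from 0.0 to 1.0 (inclusive). It helps adjust the threshold score used when predicting action_unlikely_intent at inference time.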

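Here, 0.0 means that the threshold score is adjusted so that 0% of the negative examples encountered during training score above the threshold. As a result, the conversation contexts of all negative examples trigger an action_unlikely_intent action.

A tolerance of 0.1 means that the threshold score is adjusted so that 10% of the negative examples encountered during training score above the threshold and therefore do not trigger action_unlikely_intent.

A tolerance of 1.0 means that the threshold score is so low that UnexpecTEDIntentPolicy does not trigger action_unlikely_intent for any of the negative examples it encountered during training.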
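A sketch of how a tolerance value could be turned into a threshold is shown below. It reads tolerance as the share of negative examples that are allowed to score at or above the threshold, and treats an intent as unlikely when its score is strictly below the threshold. Both function names and the scores are hypothetical, and Rasa's own computation (for example, how it interpolates between scores) may differ.

```python
def pick_threshold(negative_scores, tolerance):
    """Pick a threshold from the scores of the negative examples.

    Roughly `tolerance` of the negatives end up above the threshold,
    so action_unlikely_intent would not be triggered for them.
    """
    if not negative_scores:
        raise ValueError("need at least one negative example score")
    if not 0.0 <= tolerance <= 1.0:
        raise ValueError("tolerance must be between 0.0 and 1.0")
    ranked = sorted(negative_scores, reverse=True)  # highest score first
    index = min(int(tolerance * len(ranked)), len(ranked) - 1)
    return ranked[index]


def is_unlikely(intent_score, threshold):
    """An intent is flagged when its score falls below the threshold."""
    return intent_score < threshold


negative_scores = [i / 10 for i in range(1, 11)]  # made-up scores 0.1 .. 1.0
for tolerance in (0.0, 0.1, 1.0):
    threshold = pick_threshold(negative_scores, tolerance)
    flagged = sum(is_unlikely(s, threshold) for s in negative_scores)
    print(tolerance, threshold, flagged)
```

Because the comparison is strict, the negative example that sits exactly on the threshold is not flagged in this sketch. With a tolerance of 1.0 the threshold equals the lowest negative score, so none of the negatives are flagged.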

More configurable parameters:

+---------------------------------------+------------------------+--------------------------------------------------------------+
| Parameter                             | Default Value          | Description                                                  |
+=======================================+========================+==============================================================+
| hidden_layers_sizes                   | text: []               | Hidden layer sizes for layers before the embedding layers    |
|                                       |                        | for user messages and bot messages in previous actions       |
|                                       |                        | and labels. The number of hidden layers is                   |
|                                       |                        | equal to the length of the corresponding list.               |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| dense_dimension                       | text: 128              | Dense dimension for sparse features to use after they are    |
|                                       | intent: 20             | converted into dense features.                               |
|                                       | action_name: 20        |                                                              |
|                                       | label_intent: 20       |                                                              |
|                                       | entities: 20           |                                                              |
|                                       | slots: 20              |                                                              |
|                                       | active_loop: 20        |                                                              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| concat_dimension                      | text: 128              | Common dimension to which sequence and sentence features of  |
|                                       |                        | different dimensions get converted before concatenation.     |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| encoding_dimension                    | 50                     | Dimension size of embedding vectors                          |
|                                       |                        | before the dialogue transformer encoder.                     |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| transformer_size                      | text: 128              | Number of units in user text sequence transformer encoder.   |
|                                       | dialogue: 128          | Number of units in dialogue transformer encoder.             |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| number_of_transformer_layers          | text: 1                | Number of layers in user text sequence transformer encoder.  |
|                                       | dialogue: 1            | Number of layers in dialogue transformer encoder.            |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| number_of_attention_heads             | 4                      | Number of self-attention heads in transformers.              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| unidirectional_encoder                | True                   | Use a unidirectional or bidirectional encoder                |
|                                       |                        | for `text`, `action_text`, and `label_action_text`.          |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_key_relative_attention            | False                  | If 'True' use key relative embeddings in attention.          |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_value_relative_attention          | False                  | If 'True' use value relative embeddings in attention.        |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| max_relative_position                 | None                   | Maximum position for relative embeddings.                    |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| batch_size                            | [64, 256]              | Initial and final value for batch sizes.                     |
|                                       |                        | Batch size will be linearly increased for each epoch.        |
|                                       |                        | If constant `batch_size` is required, pass an int, e.g. `8`. |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| batch_strategy                        | "balanced"             | Strategy used when creating batches.                         |
|                                       |                        | Can be either 'sequence' or 'balanced'.                      |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| epochs                                | 1                      | Number of epochs to train.                                   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| random_seed                           | None                   | Set random seed to any 'int' to get reproducible results.    |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| learning_rate                         | 0.001                  | Initial learning rate for the optimizer.                     |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| embedding_dimension                   | 20                     | Dimension size of dialogue & system action embedding vectors.|
+---------------------------------------+------------------------+--------------------------------------------------------------+
| number_of_negative_examples           | 20                     | The number of incorrect labels. The algorithm will minimize  |
|                                       |                        | their similarity to the user input during training.          |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| ranking_length                        | 10                     | Number of top actions to normalize scores for. Applicable    |
|                                       |                        | only with loss type 'cross_entropy' and 'softmax'            |
|                                       |                        | confidences. Set to 0 to disable normalization.              |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| scale_loss                            | True                   | Scale loss inverse proportionally to confidence of correct   |
|                                       |                        | prediction.                                                  |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| regularization_constant               | 0.001                  | The scale of regularization.                                 |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| drop_rate_dialogue                    | 0.1                    | Dropout rate for embedding layers of dialogue features.      |
|                                       |                        | Value should be between 0 and 1.                             |
|                                       |                        | The higher the value the higher the regularization effect.   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| drop_rate_label                       | 0.0                    | Dropout rate for embedding layers of label features.         |
|                                       |                        | Value should be between 0 and 1.                             |
|                                       |                        | The higher the value the higher the regularization effect.   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| drop_rate_attention                   | 0.0                    | Dropout rate for attention. Value should be between 0 and 1. |
|                                       |                        | The higher the value the higher the regularization effect.   |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_sparse_input_dropout              | True                   | If 'True' apply dropout to sparse input tensors.             |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| use_dense_input_dropout               | True                   | If 'True' apply dropout to sparse features after they are    |
|                                       |                        | converted into dense features.                               |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| evaluate_every_number_of_epochs       | 20                     | How often to calculate validation accuracy.                  |
|                                       |                        | Set to '-1' to evaluate just once at the end of training.    |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| evaluate_on_number_of_examples        | 0                      | How many examples to use for hold out validation set.        |
|                                       |                        | Large values may hurt performance, e.g. model accuracy.      |
|                                       |                        | Keep at 0 if your data set contains a lot of unique examples |
|                                       |                        | of dialogue turns.                                           |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| tensorboard_log_directory             | None                   | If you want to use tensorboard to visualize training         |
|                                       |                        | metrics, set this option to a valid output directory. You    |
|                                       |                        | can view the training metrics after training in tensorboard  |
|                                       |                        | via 'tensorboard --logdir <path-to-given-directory>'.        |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| tensorboard_log_level                 | "epoch"                | Define when training metrics for tensorboard should be       |
|                                       |                        | logged. Either after every epoch ('epoch') or for every      |
|                                       |                        | training step ('batch').                                     |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| checkpoint_model                      | False                  | Save the best performing model during training. Models are   |
|                                       |                        | stored to the location specified by `--out`. Only the one    |
|                                       |                        | best model will be saved.                                    |
|                                       |                        | Requires `evaluate_on_number_of_examples > 0` and            |
|                                       |                        | `evaluate_every_number_of_epochs > 0`                        |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| featurizers                           | []                     | List of featurizer names (alias names). Only features        |
|                                       |                        | coming from the listed names are used. If list is empty      |
|                                       |                        | all available features are used.                             |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| ignore_intents_list                   | []                     | This parameter lets you configure `UnexpecTEDIntentPolicy`   |
|                                       |                        | to ignore the prediction of `action_unlikely_intent` for a   |
|                                       |                        | subset of intents. You might want to do this if you come     |
|                                       |                        | across a certain list of intents for which there are too     |
|                                       |                        | many false warnings generated.                               |
+---------------------------------------+------------------------+--------------------------------------------------------------+
| tolerance                             | 0.0                    | The `tolerance` parameter is a number that ranges from `0.0` |
|                                       |                        | to `1.0` (inclusive). It helps to adjust the threshold score |
|                                       |                        | used during prediction of `action_unlikely_intent` at        |
|                                       |                        | inference time. Here, `0.0` means that the score threshold   |
|                                       |                        | is the one that `UnexpecTEDIntentPolicy` had determined at   |
|                                       |                        | training time. A tolerance of `1.0` means that the threshold |
|                                       |                        | score is so low that `UnexpecTEDIntentPolicy` would not      |
|                                       |                        | trigger `action_unlikely_intent` for any of the "negative    |
|                                       |                        | examples" that it has encountered during training. These     |
|                                       |                        | negative examples are combinations of dialogue contexts and  |
|                                       |                        | user intents that are _incorrect_. `UnexpecTEDIntentPolicy`  |
|                                       |                        | generates these negative examples from your training data by |
|                                       |                        | picking a random story part and pairing it with a random     |
|                                       |                        | intent that doesn't occur at this point.                     |
+---------------------------------------+------------------------+--------------------------------------------------------------+

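When reviewing real conversations, we encourage you to tune the tolerance parameter in the UnexpecTEDIntentPolicy configuration to reduce the number of false warnings (intents that are in fact likely given the conversation context). As you increase tolerance from 0 to 1 in steps of 0.05, the number of false warnings should go down. However, a higher tolerance also means that action_unlikely_intent is triggered less often, so more conversation paths that are not present in your training stories will be missing from the set of flagged conversations. If you change max_history and retrain the model, you may have to re-tune tolerance as well.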
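
The batch_size row of the table above says that a value such as [64, 256] is increased linearly over the epochs. A minimal sketch of what that schedule could look like is shown here. The function name `batch_size_for_epoch` and the exact interpolation formula are assumptions for illustration, not taken from the Rasa source.

```python
from typing import List, Union


def batch_size_for_epoch(batch_size: Union[int, List[int]], epoch: int, epochs: int) -> int:
    """Return the batch size for a zero-based epoch (hypothetical helper)."""
    # A plain int means a constant batch size for every epoch.
    if isinstance(batch_size, int):
        return batch_size
    start, end = batch_size
    # With a single epoch there is nothing to interpolate.
    if epochs <= 1:
        return start
    # Assumed schedule: straight line from `start` (first epoch) to `end` (last epoch).
    return int(start + epoch * (end - start) / (epochs - 1))


sizes = [batch_size_for_epoch([64, 256], e, 5) for e in range(5)]
print(sizes)
```

With five epochs the sketch starts at 64 and ends at 256; passing an int such as 8 keeps the batch size constant.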

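Memoization Policy

The MemoizationPolicy remembers the stories from your training data. It checks whether the current conversation matches one of the stories in your stories.yml file. If it does, it predicts the next action from the matching story with a confidence of 1.0. If no matching conversation is found, the policy predicts None with a confidence of 0.0.

When looking for a match in your training data, the policy takes the last max_history turns of the conversation into account. One "turn" includes the message sent by the user and any actions the assistant performed before waiting for the next message.

You can configure the number of turns the MemoizationPolicy should use in your configuration: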

policies:
  - name: "MemoizationPolicy"
    max_history: 3
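
The lookup behaviour described above can be sketched in a few lines. This is a simplified illustration, not Rasa's implementation: a turn is modelled as a hypothetical (intent, action) pair with a single action per turn, and the names `build_lookup` and `predict` are made up for the example.

```python
from typing import Dict, List, Optional, Tuple

Turn = Tuple[str, str]  # assumption: one turn = (user intent, bot action)
Key = Tuple[Tuple[Turn, ...], str]


def _key(previous_turns: List[Turn], intent: str, max_history: int) -> Key:
    # The current intent counts as one turn, so keep max_history - 1 earlier turns.
    start = max(0, len(previous_turns) - (max_history - 1))
    return tuple(previous_turns[start:]), intent


def build_lookup(stories: List[List[Turn]], max_history: int) -> Dict[Key, str]:
    """Memorize, for every turn of every story, which action followed its context."""
    lookup: Dict[Key, str] = {}
    for story in stories:
        for i, (intent, action) in enumerate(story):
            lookup[_key(story[:i], intent, max_history)] = action
    return lookup


def predict(lookup: Dict[Key, str], previous_turns: List[Turn], intent: str,
            max_history: int) -> Tuple[Optional[str], float]:
    """Return (action, 1.0) on an exact match, otherwise (None, 0.0)."""
    action = lookup.get(_key(previous_turns, intent, max_history))
    return (action, 1.0) if action is not None else (None, 0.0)


stories = [[("greet", "utter_greet"), ("ask_weather", "utter_weather")]]
lookup = build_lookup(stories, max_history=3)
print(predict(lookup, [("greet", "utter_greet")], "ask_weather", 3))
print(predict(lookup, [], "ask_weather", 3))
```

The second call has no memorized context that matches, so the sketch falls back to None with confidence 0.0.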

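Augmented Memoization Policy

Just like the MemoizationPolicy, the AugmentedMemoizationPolicy remembers examples from your training stories for up to max_history turns. In addition, it has a forgetting mechanism that forgets a certain number of steps in the conversation history and tries to find a match in your stories with the reduced history. If a match is found, it predicts the next action with a confidence of 1.0; otherwise it predicts None with a confidence of 0.0.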
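
One way to picture the forgetting mechanism is to retry the lookup while dropping the oldest steps one at a time. The sketch below does exactly that on a flat list of event strings. It is a deliberately simplified assumption about the mechanism, and the name `predict_with_forgetting` and the event format are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def predict_with_forgetting(
    lookup: Dict[Tuple[str, ...], str],
    history: List[str],
    max_history: int,
) -> Tuple[Optional[str], float]:
    """Try the full window first, then forget the oldest steps one by one."""
    window = tuple(history[-max_history:])
    for start in range(len(window)):
        # window[start:] is the history with the `start` oldest steps forgotten.
        action = lookup.get(window[start:])
        if action is not None:
            return action, 1.0
    return None, 0.0


# Memorized context: the story "thank" on its own.
lookup = {("intent:thankyou",): "utter_youarewelcome"}
history = ["intent:greet", "action:utter_greet", "intent:thankyou"]
print(predict_with_forgetting(lookup, history, max_history=3))
```

A plain exact-match lookup would fail here because the greeting precedes the thank-you; after forgetting the first two steps the reduced history matches.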

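Slots and predictions

If some slots that are set at prediction time may not be set in your training stories (for example, in training stories that start with a reminder, not all previous slots are set), make sure to add the relevant stories without those slots to your training data as well.

Rule-based Policies

RulePolicy

The RulePolicy handles the parts of a conversation that follow a fixed behavior (e.g. business logic). It makes predictions based on any rules you have in your training data. See the Rules documentation for more information on how to define rules.

The rules themselves are written in data/rules.yml. For details on how to write them, see the following article:

Rasa Rules explained

The RulePolicy has the following configuration options: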

policies:
  - name: "RulePolicy"
    core_fallback_threshold: 0.3
    core_fallback_action_name: action_default_fallback
    enable_fallback_prediction: true
    restrict_rules: true
    check_for_contradictions: true

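core_fallback_threshold (default: 0.3): see the fallback documentation for more information.
core_fallback_action_name (default: action_default_fallback): see the fallback documentation for more information.
enable_fallback_prediction (default: true): see the fallback documentation for more information.
check_for_contradictions (default: true): before training, the RulePolicy checks that the slots and active loops set by actions are defined consistently across all rules. The following snippet contains an example of an incomplete rule: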

rules:
- rule: complete rule
  steps:
  - intent: search_venues
  - action: action_search_venues
  - slot_was_set:
    - venues: [{"name": "Big Arena", "reviews": 4.5}]

- rule: incomplete rule
  steps:
  - intent: search_venues
  - action: action_search_venues

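In the second rule, incomplete rule, action_search_venues should set the venues slot because it is set in complete rule, but this event is missing. There are several possible ways to fix this rule.

If action_search_venues cannot find a venue and the venues slot should not be set, you should explicitly set the value of the slot to null. In the following rule, the RulePolicy predicts utter_venues_not_found only if the venues slot is not set: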

rules:
- rule: fixes incomplete rule
  steps:
  - intent: search_venues
  - action: action_search_venues
  - slot_was_set:
    - venues: null
  - action: utter_venues_not_found

If you want the slot setting to be handled by a different rule or story, add wait_for_user_input: false to the end of the rule snippet:

rules:
- rule: incomplete rule
  steps:
  - intent: search_venues
  - action: action_search_venues
  wait_for_user_input: false

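After training, the RulePolicy checks that none of the rules or stories contradict each other. The following snippet is an example of two contradicting rules: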

rules:
- rule: Chitchat
  steps:
  - intent: chitchat
  - action: utter_chitchat

- rule: Greet instead of chitchat
  steps:
  - intent: chitchat
  - action: utter_greet  # `utter_greet` contradicts `utter_chitchat` from the rule above
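
The idea behind such a contradiction check can be sketched as follows: two rules contradict each other when the same preceding steps lead to different actions. This is only an illustration of the principle. The function `find_contradictions` and the flat "intent:..."/"action:..." step format are hypothetical and do not reflect how the RulePolicy stores rules internally.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Conflict = Tuple[str, str, Tuple[str, ...]]


def find_contradictions(rules: Dict[str, List[str]]) -> List[Conflict]:
    """Report pairs of rules that predict different actions for the same context."""
    # context (all steps before an action) -> {predicted action: first rule using it}
    seen: Dict[Tuple[str, ...], Dict[str, str]] = defaultdict(dict)
    conflicts: List[Conflict] = []
    for name, steps in rules.items():
        for i, step in enumerate(steps):
            if not step.startswith("action:"):
                continue
            context = tuple(steps[:i])
            for other_action, other_rule in seen[context].items():
                if other_action != step:
                    conflicts.append((other_rule, name, context))
            seen[context].setdefault(step, name)
    return conflicts


rules = {
    "Chitchat": ["intent:chitchat", "action:utter_chitchat"],
    "Greet instead of chitchat": ["intent:chitchat", "action:utter_greet"],
}
print(find_contradictions(rules))
```

For the two rules above the sketch reports one conflict, because both start from the chitchat intent but predict different actions.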

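restrict_rules (default: true): rules are restricted to one user turn, but there can be multiple bot events, including for example a form being filled and its subsequent submission. Changing this parameter to false may result in unexpected behavior.

Note: Overusing rules

Overusing rules for purposes outside of the recommended use cases will make your assistant very hard to maintain as its complexity grows.

Policy-related configuration

max_history

One important hyperparameter for Rasa Open Source policies is max_history. It controls how much dialogue history the model looks at to decide which action to take next.

You can set max_history by passing it to your policy in the policy configuration in your config.yml. The default value is None, which means that the complete dialogue history since the session restart is taken into account. Note that the RulePolicy does not have a max_history parameter; it always considers the full length of the provided rules.

As an example, say you have an out_of_scope intent that describes off-topic user messages. If your bot sees this intent several times in a row, you might want to tell the user what you can help them with. Your story might look like this: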

stories:
  - story: utter help after 2 fallbacks
    steps:
    - intent: out_of_scope
    - action: utter_default
    - intent: out_of_scope
    - action: utter_default
    - intent: out_of_scope
    - action: utter_help_message

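In this scenario the user's intent is not recognized three times in a row, after which the bot shows a help message. For the policy to learn this pattern, max_history has to be at least 4.

A larger max_history increases training time. One trick is to store information from earlier in the conversation in a slot, since slot information is available throughout the whole conversation. That way you get more context without increasing max_history.

Data Augmentation

When you train a model, Rasa Core randomly glues together stories from your story files to create longer stories.

Take the stories below as an example: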

stories:
  - story: thank
    steps:
    - intent: thankyou
    - action: utter_youarewelcome
  - story: say goodbye
    steps:
    - intent: goodbye
    - action: utter_goodbye

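You actually want to teach your policy to ignore the dialogue history when it is not relevant and to respond with the same action no matter what happened before. To achieve this, individual stories are concatenated into longer stories. From the example above, data augmentation might produce a story by combining thank with say goodbye and then thank again, which is equivalent to: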

stories:
  - story: thank -> say goodbye -> thank
    steps:
    - intent: thankyou
    - action: utter_youarewelcome
    - intent: goodbye
    - action: utter_goodbye
    - intent: thankyou
    - action: utter_youarewelcome

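You can change this behavior with the --augmentation flag, which lets you set the augmentation_factor. The augmentation_factor determines how many augmented stories are subsampled during training. The augmented stories are subsampled before training because their number can quickly become very large and you want to limit it. The number of sampled stories is augmentation_factor x 10. By default augmentation_factor is set to 50, which yields at most 500 augmented stories.

--augmentation 0 disables all augmentation behavior. TEDPolicy is the only policy affected by augmentation. Other policies such as MemoizationPolicy or RulePolicy automatically ignore all augmented stories (regardless of the augmentation_factor).

--augmentation is an important parameter when you are trying to reduce TEDPolicy training time. Reducing the augmentation_factor shrinks the training data and therefore the time needed to train the policy. However, reducing the amount of data augmentation can also reduce the performance of TEDPolicy. To compensate, we recommend using a memoization-based policy alongside TEDPolicy when you reduce the amount of data augmentation.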
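
The gluing and subsampling described in this section can be sketched as follows. The sketch only glues pairs of stories and then samples at most augmentation_factor x 10 of them; how Rasa actually combines stories is more involved, so treat the function `augment`, the step format and the fixed seed as assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple


def augment(stories: Dict[str, List[str]], augmentation_factor: int,
            seed: int = 0) -> List[Tuple[str, List[str]]]:
    """Glue pairs of stories together, then subsample the result."""
    # A factor of 0 disables augmentation altogether.
    if augmentation_factor == 0:
        return []
    glued = [
        (f"{first} -> {second}", stories[first] + stories[second])
        for first in stories
        for second in stories
        if first != second
    ]
    # Cap the number of augmented stories at augmentation_factor x 10.
    max_samples = augmentation_factor * 10
    if len(glued) > max_samples:
        glued = random.Random(seed).sample(glued, max_samples)
    return glued


stories = {
    "thank": ["intent:thankyou", "action:utter_youarewelcome"],
    "say goodbye": ["intent:goodbye", "action:utter_goodbye"],
}
for name, steps in augment(stories, augmentation_factor=50):
    print(name, steps)
```

With only two stories there are just two glued combinations, well below the cap of 500; the cap starts to matter as the number of stories grows.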

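State Featurization

When predicting an action, it is often useful to include more history than just the current state. The TrackerFeaturizer iterates over the tracker states and calls a SingleStateFeaturizer for each state to create numeric input features for a policy. The target labels correspond to bot actions or bot utterances, represented as an index in a list of all possible actions.

Featurizing a single state of the tracker has two steps:

1. The tracker provides a bag of active features:
- features indicating intents and entities, if this is the first state in a turn, i.e. the first action taken after parsing the user's message (e.g. [intent_restaurant_search, entity_cuisine])
- features indicating which slots are currently defined, e.g. slot_location if the user previously mentioned the area in which they are searching for restaurants
- features indicating the results of any API calls stored in slots, e.g. slot_matches
- features indicating what the last bot action or bot utterance was (e.g. prev_action_listen)
- features indicating whether any loop is active and which one
2. All the features are converted into numeric vectors:

The SingleStateFeaturizer uses the Rasa NLU pipeline to convert the intent and the bot action names or bot utterances into numeric vectors. See the NLU Model Configuration documentation for details on how to configure the Rasa NLU pipeline.

Entities, slots and active loops are featurized as one-hot encodings to indicate their presence.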
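
As a minimal sketch of such a presence encoding, the helper below marks each known name with 1 if it is currently active and 0 otherwise. The function `encode_presence` and the example slot names are hypothetical; the real featurizer works on the domain's own vocabulary.

```python
from typing import Iterable, List


def encode_presence(vocabulary: List[str], active: Iterable[str]) -> List[int]:
    """Return one entry per vocabulary item: 1 if it is active, else 0."""
    active_set = set(active)
    return [1 if name in active_set else 0 for name in vocabulary]


slot_vocabulary = ["location", "cuisine", "matches"]
# The user mentioned a location and an API call stored its results in `matches`.
print(encode_presence(slot_vocabulary, ["location", "matches"]))
```

The vector has a fixed length equal to the vocabulary size, so every state yields input of the same shape regardless of how many slots are set.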

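If the domain defines the possible actions [ActionGreet, ActionGoodbye], 4 additional default actions are added: [ActionListen(), ActionRestart(), ActionDefaultFallback(), ActionDeactivateForm()]. Therefore, label 0 indicates the default action listen, label 1 the default restart, label 2 a greeting and label 3 goodbye.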

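Tracker Featurizers

Policies can be trained to learn two kinds of labels:

- The next most appropriate action to be triggered by the assistant. For example, TEDPolicy is trained to do this.
- The next most likely intent that a user can express. For example, UnexpecTEDIntentPolicy is trained to learn this.

Hence, a tracker can be featurized to learn either of the labels above. Depending on the policy, the target labels are either bot actions or bot utterances, represented as an index in a list of all possible actions, or a set of intents, represented as indices in a list of all possible intents.

Tracker featurizers come in three different flavours:

1. Full Dialogue

The FullDialogueTrackerFeaturizer creates a numerical representation of stories to feed to a recurrent neural network, where the whole dialogue is fed to the network and the gradient is backpropagated from all time steps. The target label is the most appropriate bot action or bot utterance that should be triggered in the context of the conversation. The TrackerFeaturizer iterates over the tracker states and calls a SingleStateFeaturizer for each state to create numeric input features for a policy.

2. Max History

The MaxHistoryTrackerFeaturizer works very similarly to the FullDialogueTrackerFeaturizer: it creates an array of previous tracker states for each bot action or bot utterance, but the max_history parameter defines how many states go into each row of input features. If max_history is not specified, the algorithm takes the whole length of the dialogue into account. Deduplication is performed to filter out duplicated turns (bot actions or bot utterances) in terms of their previous states.

For some algorithms a flat feature vector is needed, so the input features should be reshaped to (num_unique_turns, max_history * num_input_features).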
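
The windowing, deduplication and flattening described for the max history featurizer can be sketched like this. States are modelled as small lists of numbers, and left-padding short histories with zero vectors is an assumption of the sketch; the function names `max_history_windows` and `flatten` are hypothetical.

```python
from typing import List, Optional, Tuple

State = List[int]


def max_history_windows(states: List[State], actions: List[int],
                        max_history: Optional[int] = None
                        ) -> Tuple[List[List[State]], List[int]]:
    """Build one row of previous states per action and drop duplicated turns."""
    rows: List[List[State]] = []
    labels: List[int] = []
    seen = set()
    for i, action in enumerate(actions):
        # states[: i + 1] are the states known when actions[i] is predicted.
        history = states[: i + 1]
        if max_history is not None:
            history = history[-max_history:]
        key = (tuple(tuple(state) for state in history), action)
        if key in seen:
            continue  # same previous states and same label: a duplicated turn
        seen.add(key)
        rows.append(history)
        labels.append(action)
    return rows, labels


def flatten(rows: List[List[State]], max_history: int,
            num_input_features: int) -> List[List[int]]:
    """Reshape to (num_unique_turns, max_history * num_input_features)."""
    flat = []
    for history in rows:
        padding = [[0] * num_input_features] * (max_history - len(history))
        flat.append([value for state in padding + history for value in state])
    return flat


states = [[1, 0], [0, 1], [1, 0], [0, 1]]
actions = [0, 1, 0, 1]
rows, labels = max_history_windows(states, actions, max_history=2)
print(labels)
print(flatten(rows, max_history=2, num_input_features=2))
```

The fourth turn has the same two previous states and the same label as the second, so it is removed and three unique rows remain, each of length max_history * num_input_features = 4.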

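3. Intent Max History

The IntentMaxHistoryTrackerFeaturizer inherits from the MaxHistoryTrackerFeaturizer. Since it is used by UnexpecTEDIntentPolicy, the target labels it creates are the intents that a user can express in the context of a conversation tracker. Unlike other tracker featurizers, there can be multiple target labels. It therefore pads the list of target labels on the right with a constant value (-1), so that a list of target labels of the same size is returned for each input conversation tracker.

Just like the MaxHistoryTrackerFeaturizer, it also performs deduplication to filter out duplicated turns. However, it yields one featurized tracker per correct intent for the corresponding tracker. For example, if the correct labels for an input conversation tracker have the indices [0, 2, 4], the featurizer yields three pairs of featurized trackers and target labels. The featurized trackers are identical to each other, but the target labels in the pairs are [0, 2, 4], [4, 0, 2] and [2, 4, 0].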
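
The padding and the three label orderings from the example can be reproduced with a short sketch. Reading the orderings as successive right rotations is an interpretation of the example above, and the helper names `pad_labels` and `rotations` are hypothetical.

```python
from typing import List


def pad_labels(label_lists: List[List[int]], pad_value: int = -1) -> List[List[int]]:
    """Right-pad every label list with pad_value to the length of the longest one."""
    width = max(len(labels) for labels in label_lists)
    return [labels + [pad_value] * (width - len(labels)) for labels in label_lists]


def rotations(labels: List[int]) -> List[List[int]]:
    """Yield one ordering per correct label by rotating the list to the right."""
    result = []
    current = list(labels)
    for _ in labels:
        result.append(current)
        current = current[-1:] + current[:-1]  # move the last label to the front
    return result


print(pad_labels([[0, 2, 4], [1]]))
print(rotations([0, 2, 4]))
```

Each rotation would be paired with the same featurized tracker, which gives the three pairs mentioned in the text.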

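Custom Policies

New in 3.0

Rasa Open Source 3.0 unified the implementation of NLU components and policies. This requires changes to custom policies written for earlier versions of Rasa Open Source. See the migration guide for a step-by-step guide to the migration.

You can also write custom policies and reference them in your configuration. In the example below, the last two lines show how to use a custom policy class and pass arguments to it. See the guide on custom graph components for a complete guide to custom policies.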

policies:
  - name: "TEDPolicy"
    max_history: 5
    epochs: 200
  - name: "RulePolicy"
  - name: "path.to.your.policy.class"
    arg1: "..."