The torchtext package consists of data processing utilities and popular datasets for natural language.
(1) Pipeline
'''
Defines a pipeline for transforming sequence data.
The input is assumed to be a UTF-8 encoded str (Python 3) or unicode (Python 2).
'''
class torchtext.data.Pipeline(convert_token=None)
'''
Variables:
convert_token – The function to apply to input sequence data.
pipes – The Pipelines that will be applied to input sequence data in order.
'''
'''
Create a pipeline.
Parameters =>:
convert_token – The function to apply to input sequence data. If None, the identity function is used. Default: None
'''
__init__(convert_token=None)
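A minimal sketch of the constructor behaviour described above, written as a stand-in class rather than torchtext's own code. The name `SketchPipeline` and the `ValueError` raised for a non-callable argument are assumptions made for illustration.

```python
class SketchPipeline:
    """Illustrative stand-in for the documented behaviour (hypothetical name)."""

    def __init__(self, convert_token=None):
        if convert_token is None:
            # No function given: fall back to the identity function.
            self.convert_token = SketchPipeline.identity
        elif callable(convert_token):
            self.convert_token = convert_token
        else:
            # Assumption: reject anything that is neither None nor callable.
            raise ValueError("convert_token must be None or callable")
        # A new pipeline contains only itself as a processing stage.
        self.pipes = [self]

    @staticmethod
    def identity(x):
        return x


default_pipe = SketchPipeline()           # uses the identity function
lower_pipe = SketchPipeline(str.lower)    # uses a user-supplied function

default_result = default_pipe.convert_token("Hello")
lower_result = lower_pipe.convert_token("Hello")
```

With no argument the token passes through unchanged, while `str.lower` rewrites it.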
# Add a Pipeline to be applied after this processing pipeline.
'''
Parameters =>:
pipeline – The Pipeline or callable to apply after this Pipeline.
'''
add_after(pipeline)
# Add a Pipeline to be applied before this processing pipeline.
'''
Parameters =>:
pipeline – The Pipeline or callable to apply before this Pipeline.
'''
add_before(pipeline)
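How add_after and add_before affect the order of processing can be sketched with a small stand-in class. This is not torchtext's source: `SketchPipeline`, its `__call__` loop, and returning `self` from both methods are assumptions chosen to match the descriptions above.

```python
class SketchPipeline:
    """Illustrative stand-in showing stage ordering (hypothetical name)."""

    def __init__(self, convert_token=None):
        # Default to an identity function when nothing is supplied.
        self.convert_token = convert_token if convert_token is not None else (lambda x: x)
        self.pipes = [self]

    def __call__(self, x):
        # Apply every stage in order, feeding each result to the next stage.
        for pipe in self.pipes:
            x = pipe.convert_token(x)
        return x

    @staticmethod
    def _as_pipeline(obj):
        # Wrap bare callables so they can be chained like pipelines.
        return obj if isinstance(obj, SketchPipeline) else SketchPipeline(obj)

    def add_after(self, pipeline):
        self.pipes = self.pipes + self._as_pipeline(pipeline).pipes
        return self

    def add_before(self, pipeline):
        self.pipes = self._as_pipeline(pipeline).pipes + self.pipes
        return self


pipe = SketchPipeline(str.upper)
pipe.add_before(lambda s: s + "a")   # runs first, so "a" is uppercased later
pipe.add_after(lambda s: s + "b")    # runs last, so "b" stays lowercase

result = pipe("x")
```

Here the stage added with add_before sees the raw input, and the stage added with add_after sees the output of `str.upper`.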
'''
Apply _only_ the convert_token function of the current pipeline to the input.
If the input is a list, a list with the results of applying the convert_token function to all input elements is returned.
'''
'''
Parameters =>:
x – The input to apply the convert_token function to.
*args (positional arguments) – Forwarded to the convert_token function of the current Pipeline.
'''
call(x, *args)
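The list-versus-single-value behaviour of call, and the forwarding of extra positional arguments, can be sketched as follows. `SketchPipeline` and the `truncate` helper are hypothetical names introduced for this example, not part of torchtext.

```python
class SketchPipeline:
    """Illustrative stand-in for the documented call behaviour (hypothetical name)."""

    def __init__(self, convert_token):
        self.convert_token = convert_token

    def call(self, x, *args):
        # A list is processed element by element; anything else is processed directly.
        if isinstance(x, list):
            return [self.convert_token(tok, *args) for tok in x]
        return self.convert_token(x, *args)


def truncate(token, max_len):
    # Hypothetical token function that takes one extra positional argument.
    return token[:max_len]


pipe = SketchPipeline(truncate)
call_on_list = pipe.call(["hello", "hi"], 3)   # max_len=3 is forwarded to truncate
call_on_single = pipe.call("hello", 3)
```

The extra argument `3` reaches `truncate` in both cases; only the shape of the result differs.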
'''
Return a copy of the input.
This is here for serialization compatibility with pickle.
'''
static identity(x)