Solved: AttributeError: 'Field' object has no attribute 'vocab'

_Meilinger_

Category: Troubleshooting Guide. Tags: python, pytorch, debug, torchtext, Field

Article link: https://blog.csdn.net/qq_36332660/article/details/131589715

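While reproducing some code recently, I needed the Field class from torchtext.data. This post records the problem I ran into and how I solved it.

Note: the torchtext version must not be too new

Newer releases of torchtext no longer provide a Field class in torchtext.data (Field is a class, not a method). It was moved to torchtext.legacy.data in torchtext 0.9 and removed entirely in 0.12, so the import used below only works on older versions.

Lesson: when reproducing someone else's code, also copy the version information of the environment they used.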
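One way to act on this lesson is to record the installed versions and check where Field can be imported from before running anything else. The following is only a sketch: the helper names installed_version and locate_field_class are made up for this post, and the list of candidate modules is an assumption based on where Field has lived in different torchtext releases.

```python
import importlib
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def locate_field_class():
    """Return (module_name, Field) from the first candidate module that has it.

    Returns None when torchtext is missing or too new to ship Field at all.
    """
    # Candidate locations are an assumption: the original one first, then legacy.
    for module_name in ("torchtext.data", "torchtext.legacy.data"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        field_cls = getattr(module, "Field", None)
        if field_cls is not None:
            return module_name, field_cls
    return None


if __name__ == "__main__":
    for name in ("torch", "torchtext"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
    found = locate_field_class()
    print("Field importable from:", found[0] if found else "nowhere")
```

If the last line prints "nowhere", the installed torchtext is probably too new for code written against Field, and pinning the version used by the original authors is likely the simplest fix.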

Running the following code:

from torchtext.data import Field

# tokenize_en and max_length are assumed to be defined earlier in the script.
SRC = Field(tokenize=tokenize_en,
            init_token='<sos>',
            eos_token='<eos>',
            fix_length=max_length,
            lower=True,
            batch_first=True,
            sequential=True)

TRG = Field(tokenize=tokenize_en,
            init_token='<sos>',
            eos_token='<eos>',
            fix_length=max_length,
            lower=True,
            batch_first=True,
            sequential=True)

print(SRC.vocab.stoi["<sos>"])
print(TRG.vocab.stoi["<sos>"])

Error message:

print(SRC.vocab.stoi["<sos>"])  # 2
AttributeError: 'Field' object has no attribute 'vocab'

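So I looked through the definition of the Field class for anything related to building the vocabulary, and found that build_vocab() is where the vocabulary is created: the vocab attribute is assigned only on the last line of that method. build_vocab() is defined as follows: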

class Field(RawField):

    ...

    def build_vocab(self, *args, **kwargs):
        """Construct the Vocab object for this field from one or more datasets.

        Arguments:
            Positional arguments: Dataset objects or other iterable data
                sources from which to construct the Vocab object that
                represents the set of possible values for this field. If
                a Dataset object is provided, all columns corresponding
                to this field are used; individual columns can also be
                provided directly.
            Remaining keyword arguments: Passed to the constructor of Vocab.
        """
        counter = Counter()
        sources = []
        for arg in args:
            if isinstance(arg, Dataset):
                sources += [getattr(arg, name) for name, field in
                            arg.fields.items() if field is self]
            else:
                sources.append(arg)
        for data in sources:
            for x in data:
                if not self.sequential:
                    x = [x]
                try:
                    counter.update(x)
                except TypeError:
                    counter.update(chain.from_iterable(x))
        specials = list(OrderedDict.fromkeys(
            tok for tok in [self.unk_token, self.pad_token, self.init_token,
                            self.eos_token] + kwargs.pop('specials', [])
            if tok is not None))
        self.vocab = self.vocab_cls(counter, specials=specials, **kwargs)

    ...

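Solution: call SRC.build_vocab() and TRG.build_vocab() after the Field definitions. Called with no arguments, build_vocab() produces a vocabulary that contains only the special tokens; in real use, pass it the dataset(s) the vocabulary should be built from. The program becomes: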

SRC.build_vocab()
TRG.build_vocab()

print(SRC.vocab.stoi["<sos>"])  # output: 2
print(TRG.vocab.stoi["<sos>"])  # output: 2

With that, the program runs without errors.
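Why is the index 2? In the build_vocab() excerpt above, the special tokens are collected in the order unk_token, pad_token, init_token, eos_token and handed to the vocabulary first. The sketch below imitates that logic using only the standard library. It is not torchtext's implementation: build_mini_vocab is a made-up helper, the defaults "<unk>" and "<pad>" are taken to match Field's defaults, and the ordering of ordinary tokens (most frequent first, ties alphabetical) is an assumption.

```python
from collections import Counter, OrderedDict


def build_mini_vocab(token_lists, unk_token="<unk>", pad_token="<pad>",
                     init_token=None, eos_token=None):
    """Return (itos, stoi) with special tokens first, then corpus tokens."""
    counter = Counter()
    for tokens in token_lists:
        counter.update(tokens)

    # Same de-duplication idea as in the excerpt: keep first occurrence, drop None.
    specials = list(OrderedDict.fromkeys(
        tok for tok in [unk_token, pad_token, init_token, eos_token]
        if tok is not None))

    # Assumed ordering for ordinary tokens: most frequent first, ties alphabetical.
    ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    itos = specials + [tok for tok, _ in ordered if tok not in specials]
    stoi = {tok: idx for idx, tok in enumerate(itos)}
    return itos, stoi


# No data at all, comparable to calling build_vocab() with no arguments.
itos, stoi = build_mini_vocab([], init_token="<sos>", eos_token="<eos>")
print(itos)           # ['<unk>', '<pad>', '<sos>', '<eos>']
print(stoi["<sos>"])  # 2

# With a toy corpus, ordinary tokens are appended after the special tokens.
itos, stoi = build_mini_vocab([["a", "cat"], ["a", "dog"]],
                              init_token="<sos>", eos_token="<eos>")
print(itos)           # ['<unk>', '<pad>', '<sos>', '<eos>', 'a', 'cat', 'dog']
```

The first call shows that a vocabulary built from nothing still has four entries, which is enough to make the AttributeError disappear but not enough to encode real sentences.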
