Fixing the collocations error in NLTK on Python 3.7

While learning natural language processing, I got an error when calling the collocations() method:

>>> text1.collocations()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "D:\python\test02\venv\lib\site-packages\nltk\text.py", line 444, in collocations
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
  File "D:\python\test02\venv\lib\site-packages\nltk\text.py", line 444, in <listcomp>
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
ValueError: too many values to unpack (expected 2)

So I opened D:\python\test02\venv\lib\site-packages\nltk\text.py and went to line 444:

        collocation_strings = [
            w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
        ]

The value of collocation_strings comes from collocation_list, so I looked up collocation_list:

    def collocation_list(self, num=20, window_size=2):
        """
        Return collocations derived from the text, ignoring stopwords.

        :param num: The maximum number of collocations to return.
        :type num: int
        :param window_size: The number of tokens spanned by a collocation (default=2)
        :type window_size: int
        """
        if not (
            "_collocations" in self.__dict__
            and self._num == num
            and self._window_size == window_size
        ):
            self._num = num
            self._window_size = window_size

            # print("Building collocations list")
            from nltk.corpus import stopwords

            ignored_words = stopwords.words("english")
            finder = BigramCollocationFinder.from_words(self.tokens, window_size)
            finder.apply_freq_filter(2)
            finder.apply_word_filter(lambda w: len(w) < 3 or w.lower() in ignored_words)
            bigram_measures = BigramAssocMeasures()
            self._collocations = finder.nbest(bigram_measures.likelihood_ratio, num)
        return [w1 + " " + w2 for w1, w2 in self._collocations]

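collocation_list returns a flat (one-dimensional) list in which each element is already a joined string of the form "w1 w2". Iterating over it with `for w1, w2 in ...` therefore tries to unpack a single string into two names, which fails for any string longer than two characters and raises the ValueError shown above.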
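The failure can be reproduced without NLTK. The sketch below is only an illustration: the list `collocations` is a made-up stand-in for what collocation_list returns in the source shown above, and the sample phrases are assumptions rather than real output.

```python
# Hypothetical stand-in for the return value of collocation_list():
# a flat list of strings that are already joined with a space.
collocations = ["Sperm Whale", "Moby Dick"]

error_message = None
try:
    # Same pattern as line 444 of text.py: each element is a string,
    # so Python tries to unpack its characters into w1 and w2.
    pairs = [w1 + " " + w2 for w1, w2 in collocations]
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Because "Sperm Whale" has more than two characters, the unpacking raises a ValueError of the same kind as the one in the traceback.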

Solutions
  • Method 1
    • Call collocation_list directly instead of collocations
  • Method 2
    • Modify the source code in D:\python\test02\venv\lib\site-packages\nltk\text.py as follows:
        # collocation_strings = [
        #     w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
        # ]
        collocation_strings = self.collocation_list(num, window_size)
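For Method 1, a small helper can print the result without touching the installed package. This is a minimal sketch, not part of NLTK: the name `format_collocations` and the "; " separator are assumptions made for illustration. It also accepts 2-tuples of words, in case a different NLTK version returns pairs instead of joined strings (that behaviour is an assumption here, not something confirmed by the source above).

```python
def format_collocations(items, separator="; "):
    """Join collocations into a single display string.

    Hypothetical helper. Accepts either already-joined strings
    ("Moby Dick") or 2-tuples of words (("Moby", "Dick")).
    """
    phrases = [
        item if isinstance(item, str) else " ".join(item)
        for item in items
    ]
    return separator.join(phrases)


# Sample data standing in for text1.collocation_list()
sample = ["Moby Dick", "Sperm Whale"]
print(format_collocations(sample))
```

With NLTK and its corpora installed, the call would look like `print(format_collocations(text1.collocation_list()))`, which leaves the files under site-packages unmodified.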
    