Python function conversion: converting a readability formula into a Python function

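I was given this formula, called FRES (Flesch reading-ease score), which is used to measure the readability of a document:

FRES = 206.835 - 1.015 * (total words / total sentences) - 84.6 * (total syllables / total words)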

My task is to write a Python function that returns the FRES of a text, so I need to turn this formula into a Python function.
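As a quick check of the arithmetic before any tokenization is involved, the formula can be written as a function of three counts. This is only a sketch: the name `fres_from_counts` and the sample counts are made up for illustration, and the constants are the ones used in the `score` expression in the code further down.

```python
def fres_from_counts(num_words, num_sents, num_syllables):
    """Flesch reading-ease score computed from raw counts (illustrative helper)."""
    if num_words == 0 or num_sents == 0:
        raise ValueError("need at least one word and one sentence")
    words_per_sentence = num_words / num_sents
    syllables_per_word = num_syllables / num_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical counts: 100 words, 5 sentences, 130 syllables.
score = fres_from_counts(100, 5, 130)
print(round(score, 3))  # 76.555
```

Separating the arithmetic from the counting makes it easier to see that any difference in the final score comes from how words, sentences and syllables are counted, not from the formula itself.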

I have re-implemented my code based on an answer I got. Here is what I have so far, followed by the result it gives me:

import nltk
import collections

nltk.download('punkt')
nltk.download('gutenberg')
nltk.download('brown')
nltk.download('averaged_perceptron_tagger')
nltk.download('universal_tagset')

import re
from itertools import chain
from nltk.corpus import gutenberg

VC = re.compile('[aeiou]+[^aeiou]+', re.I)

def count_syllables(word):
    return len(VC.findall(word))

def compute_fres(text):
    """Return the FRES of a text.

    >>> emma = nltk.corpus.gutenberg.raw('austen-emma.txt')
    >>> compute_fres(emma) # doctest: +ELLIPSIS
    99.40...
    """
    for filename in gutenberg.fileids():
        sents = gutenberg.sents(filename)
        words = gutenberg.words(filename)
        num_sents = len(sents)
        num_words = len(words)
        num_syllables = sum(count_syllables(w) for w in words)
        score = 206.835 - 1.015 * (num_words / num_sents) - 84.6 * (num_syllables / num_words)
    return(score)

After running the code, this is the message I got:

Failure
Expected :99.40...
Actual :92.84866041488623

File "C:/Users/PycharmProjects/a1/a1.py", line 60, in a1.compute_fres
Failed example:
    compute_fres(emma) # doctest: +ELLIPSIS
Expected:
    99.40...
Got:
    92.84866041488623

My function is supposed to pass the doctest and return 99.40... I'm also not allowed to edit the syllable-counting function, since it came with the task:

import re

VC = re.compile('[aeiou]+[^aeiou]+', re.I)

def count_syllables(word):
    return len(VC.findall(word))
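To see what the given regex actually counts, it helps to try it on a few tokens. The sketch below repeats the task's `count_syllables` unchanged; the sample tokens are arbitrary. The pattern matches a run of vowels followed by a run of non-vowels, so a trailing vowel contributes nothing and punctuation tokens always score 0.

```python
import re

VC = re.compile('[aeiou]+[^aeiou]+', re.I)

def count_syllables(word):
    return len(VC.findall(word))


# Arbitrary sample tokens, including a punctuation mark.
for token in ['the', 'Emma', 'and', 'readability', ',']:
    print(repr(token), count_syllables(token))
# 'the' 0
# 'Emma' 1
# 'and' 1
# 'readability' 4
# ',' 0
```

Because punctuation tokens add to the word count but never to the syllable count, whether they are kept in the token list changes both ratios in the formula.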

This question has been very tricky, but at least the code now produces a result instead of an error message. I'm not sure why the result is different from the expected one, though.

Any help would be much appreciated. Thank you.

Solution

BTW, there's the textstat library.

from textstat.textstat import textstat
from nltk.corpus import gutenberg

for filename in gutenberg.fileids():
    # Score the text of each file, not the filename string itself.
    print(filename, textstat.flesch_reading_ease(gutenberg.raw(filename)))

If you're set on coding up your own, you first have to:

1. decide whether a punctuation token counts as a word
2. define how to count the number of syllables in a word

If punctuation counts as a word and syllables are counted by the regex in your question, then:

import re
from itertools import chain
from nltk.corpus import gutenberg

def num_syllables_per_word(word):
    # Same pattern as in the question, including the case-insensitive flag.
    return len(re.findall('[aeiou]+[^aeiou]+', word, flags=re.I))

for filename in gutenberg.fileids():
    sents = gutenberg.sents(filename)
    words = gutenberg.words(filename)  # i.e. list(chain(*sents))
    num_sents = len(sents)
    num_words = len(words)
    num_syllables = sum(num_syllables_per_word(w) for w in words)
    score = 206.835 - 1.015 * (num_words / num_sents) - 84.6 * (num_syllables / num_words)
    print(filename, score)
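Note that the loop above scores corpus files, whereas the doctest in the question calls `compute_fres` on a raw string, and the question's version never uses its `text` argument. A minimal sketch of a function that does work on the string it is given might look like the following. The two regex tokenizers and the `count_punctuation` parameter are assumptions made so the example runs without downloads; in the actual task you would probably swap in `nltk.sent_tokenize` and `nltk.word_tokenize`, and the exact score depends on that choice, so this is not guaranteed to reproduce 99.40.

```python
import re

VC = re.compile('[aeiou]+[^aeiou]+', re.I)

def count_syllables(word):
    return len(VC.findall(word))


def compute_fres(text, count_punctuation=True):
    """Return the FRES of the given string (sketch with stand-in tokenizers)."""
    # Assumption: naive sentence split on whitespace that follows . ! or ?
    sents = [s for s in re.split(r'(?<=[.!?])\s+', text.strip()) if s]
    # Assumption: tokens are runs of word characters or single punctuation marks
    tokens = re.findall(r'\w+|[^\w\s]', text)
    if not count_punctuation:
        # Keep only tokens that contain at least one letter or digit
        tokens = [t for t in tokens if any(c.isalnum() for c in t)]

    num_sents = len(sents)
    num_words = len(tokens)
    if num_sents == 0 or num_words == 0:
        raise ValueError("text must contain at least one sentence and one word")

    num_syllables = sum(count_syllables(t) for t in tokens)
    return 206.835 - 1.015 * (num_words / num_sents) - 84.6 * (num_syllables / num_words)


sample = "The cat sat. It ran!"
print(compute_fres(sample))                           # punctuation counted as words
print(compute_fres(sample, count_punctuation=False))  # punctuation dropped
```

The two calls give different scores for the same text, which shows how much the "is punctuation a word?" decision matters when you compare your result against an expected value.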
