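from better_profanity import profanity

# "profanity" here means swear words.
# Matching ignores case.

# profanity.censor function
"""
Censoring swear words
1. profanity.censor scans text for swear words and, by default, replaces each one with four asterisks (****).
In the censored text censored_text_1:
- the swear words Fuck and jerk are both replaced with ****
2. profanity.censor also catches swear words joined to neighbouring text by dividers such as a comma, a period or an underscore, but not by @, *, ', " or $.
In the censored text censored_text_2:
- Fuck is followed directly by a comma, and jerk sits between an underscore and a period, yet both are still censored
3. profanity.censor accepts a custom replacement character in place of the asterisk.
Its signature, profanity.censor(self, text, censor_char="*"), shows that the censor_char argument overrides the default.
In the censored text censored_text_3:
- the swear words Fuck and jerk are both replaced with ----
"""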
censored_text_1 = profanity.censor('Fuck You’re a jerk!')
print(censored_text_1)  # **** You’re a ****!
censored_text_2 = profanity.censor('Fuck,You’re a_jerk.!')
print(censored_text_2)  # ****,You’re a_****.!
censored_text_3 = profanity.censor('Fuck,You’re a jerk!', '-')
print(censored_text_3)  # ----,You’re a ----!

# profanity.contains_profanity() function
"""
Checking whether a string contains swear words
profanity.contains_profanity returns True if the string contains a swear word and False otherwise.
- Fuck You’re a jerk!
return True
- You are a good boy.
return False
"""
print(profanity.contains_profanity('Fuck You’re a jerk!'))  # True
print(profanity.contains_profanity('You are a good boy.'))  # False

# profanity.load_censor_words() function
# profanity.load_censor_words_from_file() function
"""
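Loading a single set of censor words (only one set is active at a time: each call to either method below replaces the set loaded before)
load_censor_words(custom_bad_words_list) loads the words in the given list as the censor word set.
- You are a good boy.
This sentence normally contains nothing to censor, but once boy and good are loaded as censor words it is flagged.
profanity.load_censor_words_from_file(my_bad_words_file) does the same with words read from a file.
- You are a good boy.
Again, nothing is normally censored, but once You and are are loaded from the file the sentence is flagged.
Undoing a custom load
- call profanity.load_censor_words() with no arguments
"""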
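# The file-based example below needs my_bad_words.txt to exist. A minimal
# sketch for creating it, assuming (not confirmed by this script) that the
# loader expects plain text with one word per line. The helper name
# ensure_sample_wordlist is hypothetical and not part of better_profanity.
from pathlib import Path


def ensure_sample_wordlist(path='my_bad_words.txt', words=('You', 'are')):
    """Write the sample words to path, one per line, unless the file exists."""
    file_path = Path(path)
    if not file_path.exists():
        file_path.write_text('\n'.join(words) + '\n', encoding='utf-8')
    return file_path


ensure_sample_wordlist()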
custom_bad_words_list = ['good', 'boy']  # custom list of censor words
profanity.load_censor_words(custom_bad_words_list)
print(profanity.contains_profanity('You are a good boy.'))  # True
censored_text_4 = profanity.censor('You are a good boy.')
print(censored_text_4)  # You are a **** ****.
profanity.load_censor_words_from_file('my_bad_words.txt')  # this file lists You and are
censored_text_5 = profanity.censor('You are a good boy.')
print(censored_text_5)  # **** **** a good boy.

# profanity.load_censor_words() function
# profanity.load_censor_words_from_file() function
"""
Whitelisting words (temporarily exempting them from censoring) with the whitelist_words keyword argument
"""
custom_bad_words_list = ['good', 'boy']
profanity.load_censor_words(custom_bad_words_list, whitelist_words=['good'])
censored_text_6 = profanity.censor('You are a good boy.')
print(censored_text_6)  # You are a good ****.
profanity.load_censor_words_from_file('my_bad_words.txt', whitelist_words=['are'])
censored_text_7 = profanity.censor('You are a good boy.')
print(censored_text_7)  # **** are a good boy.

# profanity.add_censor_words() function
"""
Adding more censor words (can be called repeatedly)
"""
profanity.load_censor_words_from_file('my_bad_words.txt', whitelist_words=['are'])
profanity.add_censor_words(custom_bad_words_list)
censored_text_8 = profanity.censor('You are a good boy.')
print(censored_text_8)  # **** are a **** ****.

# Limitations
"""
Words are compared character by character, so the censor is easy to bypass by adding even a single character to a word.
"""
profanity.load_censor_words()  # restore the default word list
censored_text_9 = profanity.censor('Fuck,You’re a jerk!')
print(censored_text_9)  # ****,You’re a ****!
censored_text_10 = profanity.censor('Fuckk,You’re a jerkk!')
print(censored_text_10)  # Fuckk,You’re a jerkk!
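
# The bypass above can be pictured with a small stdlib-only model. This is a
# simplified sketch, not better_profanity's actual implementation: the helper
# name toy_censor and the two-word set are hypothetical, and the token rule
# (letters, digits and @ $ * " ' stay inside a word) is an assumption based on
# the divider note in the censor section.
import re

_TOY_BAD_WORDS = {'fuck', 'jerk'}
_TOY_TOKEN = re.compile(r"[A-Za-z0-9@$*\"']+")


def toy_censor(text, bad_words=_TOY_BAD_WORDS, censor_char='*'):
    """Replace each token that exactly matches a bad word, ignoring case."""
    def replace(match):
        token = match.group(0)
        return censor_char * 4 if token.lower() in bad_words else token
    return _TOY_TOKEN.sub(replace, text)


# Whole tokens are compared, so one extra letter yields a token that is not in
# the set and passes through unchanged.
print(toy_censor('Fuck,You’re a_jerk.!'))   # ****,You’re a_****.!
print(toy_censor('Fuckk,You’re a jerkk!'))  # Fuckk,You’re a jerkk!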