rspamd: learning spam/ham

Learn a message as spam: learn_spam

[root@umail Maildir]# rspamc -h localhost:11334 -P q1cc learn_spam <.Junk/cur/1440728579.M621653P19776.umail.westhost.cn\,S\=2440\,W\=2501\:2\,
Results for file: stdin
success = true;

Learn a message as ham (legitimate mail): learn_ham

[root@umail cur]# rspamc -h localhost:11334 -P q1cc learn_ham <./1440729646.M846175P20103.umail.westhost.cn\,S\=1955\,W\=1999\:2\,
Results for file: stdin
success = true;
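
Training one message at a time gets tedious for a whole Maildir folder. The following is a minimal Python sketch that feeds every file in a folder to rspamc on stdin, the same way as the commands above. The function names `build_learn_command` and `learn_folder` are hypothetical helpers introduced here for illustration; only the `-h` and `-P` options, the `learn_spam`/`learn_ham` commands, and the `success = true` marker are taken from the transcript above.

```python
import subprocess
from pathlib import Path


def build_learn_command(kind, host="localhost:11334", password=None):
    """Return the rspamc argument list for learn_spam or learn_ham."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["rspamc", "-h", host]
    if password:
        cmd += ["-P", password]
    cmd.append(f"learn_{kind}")
    return cmd


def learn_folder(folder, kind, host="localhost:11334", password=None,
                 run=subprocess.run):
    """Pipe each regular file in `folder` to rspamc; return (ok, failed)."""
    cmd = build_learn_command(kind, host, password)
    ok = failed = 0
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        with path.open("rb") as msg:
            result = run(cmd, stdin=msg, capture_output=True, text=True)
        # Assumption: a successful learn prints "success = true", as in the
        # transcript above; anything else is counted as a failure.
        if result.returncode == 0 and "success = true" in result.stdout:
            ok += 1
        else:
            failed += 1
    return ok, failed
```

For example, `learn_folder(".Junk/cur", "spam", password="...")` would train on every message in the Junk folder, with the controller password supplied in place of the dots. The `run` parameter exists only so the function can be exercised without a running rspamd.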

Check rspamd status: rspamc stat

[root@umail cur]# rspamc stat
Results for command: stat
Messages scanned: 9
Messages with action reject: 0, 0.00%
Messages with action soft reject: 0, 0.00%
Messages with action rewrite subject: 0, 0.00%
Messages with action add header: 5, 55.55%
Messages with action greylist: 0, 0.00%
Messages with action no action: 4, 44.44%
Messages treated as spam: 5, 55.55%
Messages treated as ham: 4, 44.44%
Messages learned: 0
Connections count: 9
Control connections count: 27
Pools allocated: 74
Pools freed: 43
Bytes allocated: 1M
Memory chunks allocated: 162
Shared chunks allocated: 29
Chunks freed: 59
Oversized chunks: 1
Fuzzy hashes stored: 0
Fuzzy hashes expired: 0
Fuzzy hashes checked: 0 0 0
Fuzzy hashes found: 0 0 0
Statfile: BAYES_SPAM length: 50M; free blocks: 3M; total blocks: 3M; free: 100.00%; learned: 1
Statfile: BAYES_HAM length: 50M; free blocks: 3M; total blocks: 3M; free: 100.00%; learned: 1
Total learns: 2
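
To confirm that the learn commands took effect, the per-statfile `learned` counters and the `Total learns` line are the fields to watch. Below is a small Python sketch that pulls those numbers out of the text output; `parse_stat` is a hypothetical helper, and the regular expressions assume the exact line format shown in the transcript above, which may differ between rspamd versions.

```python
import re


def parse_stat(text):
    """Extract action counts, statfile learn counts and totals from `rspamc stat` output."""
    stats = {"actions": {}, "statfiles": {}}
    for raw in text.splitlines():
        line = raw.strip()
        # e.g. "Messages with action add header: 5, 55.55%"
        m = re.match(r"Messages with action (.+?): (\d+), [\d.]+%", line)
        if m:
            stats["actions"][m.group(1)] = int(m.group(2))
            continue
        # e.g. "Statfile: BAYES_SPAM length: 50M; ...; learned: 1"
        m = re.match(r"Statfile: (\S+) .*learned: (\d+)", line)
        if m:
            stats["statfiles"][m.group(1)] = int(m.group(2))
            continue
        # e.g. "Messages scanned: 9" or "Total learns: 2"
        m = re.match(r"(Messages scanned|Total learns): (\d+)$", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats


# A few lines copied from the transcript above.
SAMPLE = """Messages scanned: 9
Messages with action add header: 5, 55.55%
Messages with action no action: 4, 44.44%
Statfile: BAYES_SPAM length: 50M; free blocks: 3M; total blocks: 3M; free: 100.00%; learned: 1
Statfile: BAYES_HAM length: 50M; free blocks: 3M; total blocks: 3M; free: 100.00%; learned: 1
Total learns: 2
"""

stats = parse_stat(SAMPLE)
print(stats["statfiles"], stats["Total learns"])
```

In the sample, each statfile reports one learned message and the total is 2, which matches the one learn_spam and one learn_ham call made earlier.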

Reposted from: https://my.oschina.net/hxily/blog/498482

The SPAM/HAM dataset is an English-language dataset for spam classification that can be used to train machine learning models. It contains a file named spam.csv holding the data used to classify spam email.

The dataset is available on Kaggle at https://www.kaggle.com/c/ds100fa19, where related posts and spam classification exercises can also be found.

To load the data, read the training and test sets with pandas:

```python
import pandas as pd
import numpy as np

# Load the training and test sets.
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

# Preview the first rows of the training set.
train.head()
```

To check whether the dataset contains missing cells, use numpy's sum function to count the null cells in each column of train and test:

```python
# isnull() yields a boolean table; summing along axis 0 counts nulls per column.
print(np.sum(np.array(train.isnull()), axis=0))
print(np.sum(np.array(test.isnull()), axis=0))
```

This prints the number of missing cells per column for train and test. [1][2][3]

References

- [1] [spam-and-ham-dataset.zip](https://download.csdn.net/download/qq_32742431/12129001)
- [2][3] [[Kaggle] Spam/Ham Email Classification 垃圾邮件分类(RNN/GRU/LSTM)](https://blog.csdn.net/qq_21201267/article/details/111059250)