Study Notes (13): Identifying Spam with TensorFlow

林咚咚

Category: python

Article link: https://blog.csdn.net/weixin_39878297/article/details/84204354

1. Collecting and cleaning the dataset

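Start with an entry-level spam classification dataset such as SpamBase (download: http://archive.ics.uci.edu/ml/machine-learning-databases/spambase/). Each record has 58 comma-separated attributes: the first 57 are features and the last one is the spam flag (1 for spam, 0 for non-spam).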
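As a quick illustration of the record layout described above, here is a minimal sketch that splits one line into its 57 features and its label. The sample line is synthetic (made up for this example, not a row from the dataset), and `parse_record` is a hypothetical helper name.

```python
def parse_record(line, n_features=57):
    """Split one comma-separated record into (features, label)."""
    fields = line.strip().split(',')
    if len(fields) != n_features + 1:
        raise ValueError('expected %d fields, got %d'
                         % (n_features + 1, len(fields)))
    features = [float(v) for v in fields[:n_features]]
    label = int(fields[-1])  # last field is the spam flag
    return features, label

# Synthetic record: 57 zero-valued features followed by the spam flag.
sample_line = ','.join(['0.0'] * 57 + ['1']) + '\n'
features, label = parse_record(sample_line)
print(len(features), label)  # prints: 57 1
```

Checking the field count up front makes a malformed line fail loudly instead of silently producing a short feature vector.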

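import numpy as np
from sklearn.model_selection import train_test_split

def load_SpamBase(filename):
    """Load SpamBase and return a 60/40 train/test split."""
    x = []
    y = []
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            v = line.split(',')
            # The last field is the spam flag; the first 57 are features.
            y.append(int(v[-1]))
            x.append([float(t) for t in v[:57]])

    x = np.array(x)
    y = np.array(y)
    print(x.shape)
    print(y.shape)

    # Hold out 40% of the samples for testing; random_state fixes the shuffle.
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=0)
    print(x_train.shape)
    print(x_test.shape)
    return x_train, x_test, y_train, y_test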
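To see what `test_size=0.4` does in the loader, the following sketch runs the same `train_test_split` call on a tiny synthetic array. The 10-row array is an assumption for illustration only and stands in for the parsed dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the parsed dataset: 10 rows, 57 features each.
x = np.zeros((10, 57))
y = np.array([0, 1] * 5)

# Same arguments as in the loader: 40% of the rows go to the test set.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.4, random_state=0)
print(x_train.shape, x_test.shape)  # prints: (6, 57) (4, 57)
```

Fixing `random_state` means the same rows land in the same split on every run, which keeps later accuracy numbers comparable.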

2. Using naive Bayes
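A minimal sketch of what a naive Bayes baseline could look like, assuming scikit-learn's `GaussianNB`. The choice of the Gaussian variant, the synthetic two-class data, and the accuracy metric are all assumptions made for illustration, not the author's code. With the real data, the synthetic arrays would be replaced by the output of `load_SpamBase`.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Synthetic two-class data with 57 features, standing in for SpamBase.
rng = np.random.RandomState(0)
x = np.vstack([rng.normal(0.0, 1.0, size=(100, 57)),
               rng.normal(3.0, 1.0, size=(100, 57))])
y = np.array([0] * 100 + [1] * 100)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.4, random_state=0)

# Fit a Gaussian naive Bayes classifier and score it on the held-out set.
clf = GaussianNB()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
accuracy = accuracy_score(y_test, y_pred)
print(accuracy)
```

The synthetic classes are deliberately well separated, so the score here says nothing about how the classifier would perform on actual spam data.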
