Python 3.6: scraping the WeChat friend list and personal signatures, and drawing a word cloud of the signatures

1. Brief introduction

This experiment mainly uses the following libraries:

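 1) itchat --- a WeChat interface; it generates the QR code that is scanned with WeChat to log in

 2) re (regular expressions) --- friends' signatures mix Chinese and English text; this experiment keeps only the Chinese, so the re module is used to strip all other characters

 3) wordcloud (word cloud) --- used to generate the Chinese word cloud

 4) jieba (Chinese word segmentation) --- billed as the best Chinese word segmentation tool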
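
To make the role of the re module concrete, here is a minimal sketch of the "keep only Chinese" step. The helper name keep_chinese and the sample string are assumptions made for illustration; the character range U+4E00 to U+9FA5 is the one used in the script in section 3.

```python
import re

# Matches every character outside the U+4E00..U+9FA5 range of Chinese characters.
NON_CHINESE = re.compile(r"[^\u4e00-\u9fa5]")

def keep_chinese(signature):
    """Return the signature with every non-Chinese character removed."""
    return NON_CHINESE.sub("", signature)

# Hypothetical signature mixing English, punctuation, markup and Chinese.
print(keep_chinese('Hello, 世界! <span class="emoji"></span>你好'))
```

Note that this range covers the common Chinese characters only, so digits, punctuation and rarer characters outside it are dropped as well.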

2. Installing the libraries

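pip install jieba
pip install itchat
pip install wordcloud
pip install pandas matplotlib numpy pillow

re is part of the Python standard library and does not need to be installed. The last line covers the other third-party packages the script imports (PIL is provided by the pillow package).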
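
Before running the script, it can help to check that every module it imports is actually available. The following is only a sketch: the REQUIRED list is read off the import statements of the script in section 3, and the helper name missing_modules is made up for this example.

```python
from importlib.util import find_spec

# Third-party modules imported by the script (PIL is installed as "pillow").
REQUIRED = ["jieba", "itchat", "wordcloud", "pandas", "matplotlib", "numpy", "PIL"]

def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```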

3. Experiment code

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# @Time    : 2018/1/19 14:37
# @Author  : Z.C.Wang
# @Email   : 
# @File    : spider_wechat.py
# @Software: PyCharm Community Edition
"""
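Description :
Fetch the WeChat friend list and signatures, save them to disk,
and draw a word cloud of the signatures.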
"""
import re
import jieba
import itchat
from pandas import DataFrame
import matplotlib.pyplot as plt
from wordcloud import WordCloud, ImageColorGenerator
import numpy as np
import PIL.Image as Image
import pickle

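def get_var(var):
    """Collect the field `var` from every entry of the global `friends` list."""
    # `friends` is assigned in the __main__ block before this function is called.
    return [friend[var] for friend in friends]

def list2str(wordlist):
    """Join the segmented words into one space-separated string."""
    return ' '.join(wordlist)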

if __name__ == '__main__':
    itchat.login()  # generates the QR code to scan with WeChat
    friends = itchat.get_friends(update=True)
    # Count friends by gender, skipping the first entry of the list.
    male = female = other = 0
    for i in friends[1:]:
        sex = i['Sex']
        if sex == 1: male += 1
        elif sex == 2: female += 1
        else: other += 1
    total = len(friends[1:])
    # print('Male friends: %.2f%%' % float(male/total*100))
    # print('Female friends: %.2f%%' % float(female/total*100))
    # print('Friends of unspecified gender: %.2f%%' % float(other/total*100))
    Nickname = get_var('NickName')
    Sex = get_var('Sex')
    Province = get_var('Province')
    print(Province)
    City = get_var('City')
    Signature = get_var('Signature')
    data = {'Nickname': Nickname, 'Sex': Sex, 'Province': Province,
            'City': City, 'Signature': Signature}
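    # Save the raw data with pickle; the with-block closes the file afterwards.
    with open('data.txt', 'wb') as f:
        pickle.dump(data, f)
    frame = DataFrame(data)
    # utf-8-sig writes a BOM so that Excel detects the encoding of the Chinese text.
    frame.to_csv('info.csv', index=True, encoding='utf-8-sig')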

    # Keep only Chinese characters (U+4E00 to U+9FA5). Everything else is
    # removed, including ASCII emoji markup such as 'span', 'class' and 'emoji'.
    rep = re.compile("[^\u4e00-\u9fa5]")
    siglist = []
    for i in friends:
        signature = rep.sub('', i['Signature'])
        siglist.append(signature)
    text = ''.join(siglist)
    # Full mode (cut_all=True) returns every word jieba can find in the text.
    wordlist = jieba.cut(text, cut_all=True)
    wordlist = list(wordlist)
    String = list2str(wordlist)

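    # alice.png provides both the mask shape and the colors of the word cloud.
    coloring = np.array(Image.open('alice.png'))
    # A font with Chinese glyphs is required; this path is Windows-specific.
    my_wordcloud = WordCloud(background_color='white', max_words=2000,
                             mask=coloring, max_font_size=55, random_state=42,
                             scale=2, font_path=r'C:\Windows\Fonts\simhei.ttf').generate(String)
    # Recolor the words with the colors of the mask image and display the result once.
    image_colors = ImageColorGenerator(coloring)
    plt.imshow(my_wordcloud.recolor(color_func=image_colors))
    plt.axis('off')
    plt.show()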
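
The script counts friends by gender, but the statements that print the percentages are commented out. As a rough sketch, the same calculation can be wrapped in a function that also handles an empty list. The function name gender_ratio and the sample records are assumptions for illustration; the coding 1 = male, 2 = female, anything else = unspecified follows the script above.

```python
def gender_ratio(friends):
    """Return the percentage of male, female and unspecified friends."""
    counts = {"male": 0, "female": 0, "other": 0}
    for friend in friends:
        sex = friend.get("Sex")
        if sex == 1:
            counts["male"] += 1
        elif sex == 2:
            counts["female"] += 1
        else:
            counts["other"] += 1
    total = len(friends)
    if total == 0:
        # Avoid dividing by zero when the list is empty.
        return {key: 0.0 for key in counts}
    return {key: value / total * 100 for key, value in counts.items()}

# Hypothetical sample records; real ones come from itchat.get_friends().
sample = [{"Sex": 1}, {"Sex": 1}, {"Sex": 2}, {"Sex": 0}]
for label, share in gender_ratio(sample).items():
    print("%s: %.2f%%" % (label, share))
```

In the script itself, the call would be gender_ratio(friends[1:]) so that the same entries are counted as in the loop above.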

4. Results


