Building a Local WeChat Database

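My phone developed a problem, so I wanted to migrate my WeChat chat history. While searching, I found an open-source project on GitHub that is very helpful for migrating WeChat data, viewing and exporting chat history, and analyzing the data to generate a visual annual report.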

Project URL: https://github.com/LC044/WeChatMsg/blob/master/

I. Quick-Start Guide

1. Download the executable MemoTrace-1.1.1.exe

Baidu Netdisk link: https://pan.baidu.com/s/1rzRPLaI50ETXvWqGdZVAhw
Extraction code: 2y1w

Double-click it to run. The interface looks like this:

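2. Migrate the chat history from your phone

Note: skip this step if you only want to work with the chat history already on your computer.

Steps: WeChat on the phone -> Me -> Settings -> Chats -> Chat History Migration and Backup -> Migrate -> Migrate to WeChat on PC (restart WeChat once the migration finishes)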

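3. Set the path and get the account information

(1) First set the WeChat path. You can look it up under WeChat -> Settings -> File Management.

(2) Select one of the wxid folders:

(3) Click "Get Info" (获取信息).

(4) Then click "Start" (开始启动) and, when prompted, exit and relaunch the .exe.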
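
If several accounts have logged in on the same computer, it can help to list the candidate folders before picking one in step (2). The following is a minimal sketch, not part of MemoTrace: the helper name `find_wxid_dirs` is hypothetical, and it assumes that account folders sit directly under the WeChat file directory and that their names start with `wxid_`, as in the folder name mentioned in this article.

```python
from pathlib import Path


def find_wxid_dirs(wechat_files_root: str) -> list[Path]:
    """Return the sub-folders whose names start with 'wxid_', sorted by name.

    Assumption: account folders live directly under the WeChat file directory.
    Returns an empty list if the directory does not exist.
    """
    root = Path(wechat_files_root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and p.name.startswith("wxid_")
    )
```

Pass it the path shown under WeChat -> Settings -> File Management and print the result to see which folders are available.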

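4. Export the data

Three export methods are available, as shown in the figure below.

The batch export method lets you select some or all contacts and export multiple message types in multiple formats. The HTML format renders messages the same way as the original chat window and is very pleasant to read, as shown in the figure below: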

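5. View the analysis reports

Select "Friends" (好友) in the left sidebar, then pick a contact to view "Statistics" (统计信息), "Sentiment Analysis" (情感分析), "Annual Report" (年度报告), and so on.

Note: you can save MemoTrace-1.1.1.exe together with the chat history folder wxid_.... to your own cloud drive. The records are then kept permanently and can be viewed anywhere, with no need to worry about local hardware storage. Alternatively, export the chat history to HTML or another format and keep that in your cloud drive.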

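II. Further Development on the Open-Source Code

The source code corresponding to MemoTrace-1.1.1.exe above is stored at the following Baidu Netdisk link.

Link: https://pan.baidu.com/s/1wyR-nSWY443VredgS2UQaQ
Extraction code: 974a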

III. Data Visualization

The exported .csv file can be used to draw bar charts, pie charts, heatmaps, and more.
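
The plotting code in this section reads two columns from the exported CSV, `StrTime` and `IsSender`. Before loading a large export, a quick header check can confirm that both are present. This is a rough sketch using only the standard library. The helper name `missing_columns` is hypothetical, and the `utf-8-sig` encoding is an assumption based on the `_utf8` suffix in the file name used here.

```python
import csv

# Columns used by the plotting code in this section
REQUIRED_COLUMNS = ("StrTime", "IsSender")


def missing_columns(csv_path: str, required=REQUIRED_COLUMNS,
                    encoding: str = "utf-8-sig") -> list[str]:
    """Read only the header row and return the required columns that are absent."""
    with open(csv_path, newline="", encoding=encoding) as f:
        header = next(csv.reader(f), [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [col for col in required if col not in present]
```

An empty list means the file has the columns the script needs. Otherwise the returned names show what to look for in the export settings.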

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.font_manager import FontProperties
 
df = pd.read_csv(r"C:\Users\x\Desktop\data\聊天记录\****_utf8.csv", sep=',')
 
 
# Monthly message count trend (messages are grouped by calendar month; marker size scales with the count)
df['month'] = pd.to_datetime(df['StrTime']).dt.month
month_counts = df['month'].value_counts().sort_index()
scaled_sizes = month_counts * 0.08
plt.figure(facecolor='white')
plt.title('Figure 1: Monthly Trends in Message Counts', fontname='Times New Roman',fontsize=22)
plt.xlabel('Month', fontname='Times New Roman',fontsize=20)
plt.ylabel('Messages', fontname='Times New Roman',fontsize=20)
plt.xticks(range(1, 13), fontname='Times New Roman',fontsize=15)
plt.yticks(fontname='Times New Roman',fontsize=15)
plt.scatter(month_counts.index, month_counts.values, color='#80BCBD', marker='o',s=scaled_sizes)
plt.grid(True, linestyle='solid', linewidth=1, color='lightgrey',axis='y')
fig = plt.gcf()
fig.set_size_inches(15,8)
fig.savefig('chat_month.png',dpi=100)
plt.show()
 
 
 
# Monthly message counts per sender (IsSender == 1 is XiaoSong, IsSender == 0 is TiTi)
months = pd.to_datetime(df['StrTime']).dt.month
labels = ['TiTi', 'XiaoSong']  # same order as the two plot calls below
month_counts_xiaosong = months[df['IsSender'] == 1].value_counts().sort_index()
month_counts_titi = months[df['IsSender'] == 0].value_counts().sort_index()
max_xiaosong = month_counts_xiaosong.max()
max_month_xiaosong = month_counts_xiaosong.idxmax()
max_titi = month_counts_titi.max()
max_month_titi = month_counts_titi.idxmax()
month_counts_titi.plot(kind='line', marker='o', label='TiTi',color='#FFC0D9')
month_counts_xiaosong.plot(kind='line', marker='o', label='XiaoSong',color='#8ACDD7')
plt.annotate(f'Max: {max_titi}', xy=(max_month_titi, max_titi), xytext=(max_month_titi + 0.5, max_titi + 10),
             arrowprops=dict(facecolor='black', arrowstyle='->'),
             fontsize=18,fontname='Times New Roman')
plt.annotate(f'Max: {max_xiaosong}', xy=(max_month_xiaosong, max_xiaosong), xytext=(max_month_xiaosong + 0.4, max_xiaosong + 10),
             arrowprops=dict(facecolor='black', arrowstyle='->'),
             fontsize=18,fontname='Times New Roman')
plt.title('Figure 2: Trends in Monthly Message Counts', fontname='Times New Roman',fontsize=22)
plt.xlabel('Month', fontname='Times New Roman',fontsize=20)
plt.ylabel('Messages', fontname='Times New Roman',fontsize=20)
plt.xticks(range(1, 13), fontname='Times New Roman',fontsize=15)
plt.yticks(fontname='Times New Roman',fontsize=15)
plt.grid(True, linestyle='solid', linewidth=0.5, color='lightgrey')
font_prop = FontProperties(family='Times New Roman')
plt.legend(labels, loc="best",prop=font_prop)
plt.tight_layout()
fig = plt.gcf()
fig.set_size_inches(15,8)
fig.savefig('chat_plot.png',dpi=100)
plt.show()
 
 
 
 
 
 
# Pie chart: share of messages sent by each person
# value_counts() orders by frequency, so reindex to [1, 0] to keep the counts aligned with the labels
value_counts = df['IsSender'].value_counts().reindex([1, 0], fill_value=0)
labels = ['XiaoSong', 'TiTi']  # IsSender == 1 is XiaoSong, IsSender == 0 is TiTi
colors = ['#8ACDD7', '#FFC0D9']
explode = (0.1, 0)
plt.figure(figsize=(15, 8))
def func(pct, allvals):
    # Convert the percentage back to an absolute message count
    absolute = int(round(pct/100.*np.sum(allvals)))
    return f"{pct:.1f}%\n({absolute:d})"
plt.pie(value_counts, explode=explode, labels=labels, colors=colors,
        autopct=lambda pct: func(pct, value_counts), shadow=True, startangle=80, textprops={'style':'italic' , 'fontsize': 18})
plt.title('Figure 3: Distribution of Messages: TiTi vs. XiaoSong', fontname='Times New Roman',fontsize=22)
font_prop = FontProperties(family='Times New Roman')
plt.legend(labels, loc="best",prop=font_prop)
plt.axis('equal')
fig = plt.gcf()
fig.savefig('chat_pie.png',dpi=100)
plt.show()
 
 
 
 
# Distribution of messages across the days of the week
dates = pd.to_datetime(df['StrTime'])
weekdays = dates.dt.day_name()
weekday_counts = weekdays.value_counts()
colors = ['#FF90BC', '#FFC0D9', '#F9F9E0', '#8ACDD7', '#EEE7DA', '#88AB8E', '#AFC8AD']
explode = (0.1, 0, 0, 0, 0, 0, 0)
plt.figure(figsize=(8, 8))
plt.pie(weekday_counts, explode=explode, labels=weekday_counts.index, colors=colors, autopct='%1.1f%%', shadow=True, startangle=90,textprops={'fontsize': 18})
plt.title('Figure 4: The Distribution of Messages during the Week', fontname='Times New Roman',fontsize=22)
font_prop = FontProperties(family='Times New Roman')
plt.legend(labels=weekday_counts.index, loc="best",prop=font_prop)
plt.axis('equal')
fig = plt.gcf()
fig.set_size_inches(15,8)
fig.savefig('chat_pie_2',dpi=100)
plt.show()
 
 
 
# Distribution of messages over the hours of the day
sns.set_style('darkgrid')  # set the style before the axes are created so that it takes effect
df['hour'] = pd.to_datetime(df['StrTime']).dt.hour
sns.histplot(df['hour'],bins=np.arange(25),kde=True, color='lightcoral')  # one bin per hour
plt.title('Figure 5: The Distribution of Messages throughout the Day', fontname='Times New Roman',fontsize=18)
plt.xlabel('Time', fontname='Times New Roman',fontsize=18)
plt.ylabel('Number of messages', fontname='Times New Roman',fontsize=18)
plt.xticks(np.arange(0, 25, 1.0), fontname='Times New Roman',fontsize=15)
plt.yticks(fontname='Times New Roman',fontsize=15)
fig = plt.gcf()
fig.set_size_inches(15,8)
fig.savefig('chat_time.png',dpi=100)
plt.show()
 
 
 
# Daily message counts for each month of 2023
year = 2023
by_date = df.set_index(pd.to_datetime(df['StrTime']))  # indexed copy; df itself is left unchanged
monthly_counts = {}
for month in range(1, 13):
    month_df = by_date[(by_date.index.year == year) & (by_date.index.month == month)]
    if month_df.empty:
        continue  # skip months with no messages (df.loc['2023-MM'] would raise KeyError)
    monthly_counts[f'{year}-{month:02d}'] = month_df.resample('D').size()
plt.figure(figsize=(12, 8))
colors = ['#FF9843', '#3468C0', '#D63484', '#402B3A','#1f77b4', '#ff7f0e', '#2ca02c', '#d62728','#9467bd', '#e377c2', '#8c564b', '#17becf']
for idx, (month, count_data) in enumerate(monthly_counts.items()):
    plt.plot(count_data.index.day, count_data.values, marker='o', linestyle='-', color=colors[idx], label=month)
    max_value = count_data.max()
    max_day = count_data.idxmax().day
    plt.annotate(f'Max: {max_value}', xy=(max_day, max_value), xytext=(max_day + 1.2, max_value + 1),
                 arrowprops=dict(facecolor='black', arrowstyle='->'),
                 fontsize=18, fontname='Times New Roman')
plt.title('Figure 6: The Number of Messages Distributed Each Month from January 2023 to December 2023',
          fontname='Times New Roman', fontsize=22)
plt.xlabel('Day', fontname='Times New Roman', fontsize=20)
plt.ylabel('Messages', fontname='Times New Roman', fontsize=20)
plt.xticks(range(1, 32), fontname='Times New Roman', fontsize=15)  # one tick per day of the month
plt.yticks(fontname='Times New Roman', fontsize=15)
font_prop = FontProperties(family='Times New Roman')
plt.legend(loc="best", prop=font_prop)  # labels come from label=month in plt.plot
plt.grid(True, linestyle='solid', linewidth=0.5, color='lightgrey')  # add grid lines
plt.tight_layout()
fig = plt.gcf()
fig.set_size_inches(15, 8)
fig.savefig('chat_plot_2.png', dpi=100)
plt.show()
 
 
 
 
 
# Heatmap 1: number of messages per day
df['Date'] = pd.to_datetime(df['StrTime']).dt.date
daily_counts = df['Date'].value_counts().reset_index()
daily_counts.columns = ['Date', 'Chat_Count']
heatmap_data = daily_counts.pivot_table(index='Date', values='Chat_Count', aggfunc='sum')
plt.figure(figsize=(14, 10))
sns.heatmap(heatmap_data, cmap="Reds",linewidths=0.5, linecolor='gray',xticklabels=False)
plt.title('Figure 7: Chat Counts Heatmap', fontname='Times New Roman',fontsize=22)
plt.ylabel('Date', fontname='Times New Roman',fontsize=20)
plt.yticks(fontname='Times New Roman')
plt.tight_layout()
fig = plt.gcf()
fig.set_size_inches(15,8)
fig.savefig('heatmap_1.png',dpi=100)
plt.show()
 
 
 
 
# Heatmap 2: number of messages by day of month (rows) and month (columns)
df['Date'] = pd.to_datetime(df['StrTime'])
df['Month'] = df['Date'].dt.month
heatmap_data = df.pivot_table(index=df['Date'].dt.day, columns='Month', values='StrTime', aggfunc='count')
sns.heatmap(heatmap_data, cmap="GnBu", linewidths=0.5, linecolor='gray')
plt.title('Figure 8: Chat Counts Heatmap by Month', fontname='Times New Roman',fontsize=22)
plt.xlabel('Month', fontname='Times New Roman',fontsize=20)
plt.ylabel('Day of Month', fontname='Times New Roman',fontsize=20)
plt.xticks(fontname='Times New Roman',fontsize=15) 
plt.yticks(fontname='Times New Roman',fontsize=15)
plt.tight_layout()
fig = plt.gcf()
fig.set_size_inches(15,8)
fig.savefig('heatmap_2.png',dpi=100)
plt.show()
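
This section mentions bar charts, but none of the plots above draws one. The following is a minimal sketch of a weekday bar chart built from the same `StrTime` column. The function names and the output file name `chat_weekday_bar.png` are hypothetical, and `ax.bar_label` assumes matplotlib 3.4 or newer.

```python
import matplotlib.pyplot as plt
import pandas as pd

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_counts(df: pd.DataFrame) -> pd.Series:
    """Count messages per weekday, Monday first; weekdays without messages get 0."""
    names = pd.to_datetime(df["StrTime"]).dt.day_name()
    return names.value_counts().reindex(WEEKDAYS, fill_value=0)


def plot_weekday_bars(df: pd.DataFrame, out_path: str = "chat_weekday_bar.png"):
    """Draw one bar per weekday, label each bar with its count, and save the figure."""
    counts = weekday_counts(df)
    fig, ax = plt.subplots(figsize=(15, 8))
    bars = ax.bar(counts.index, counts.values, color="#8ACDD7")
    ax.bar_label(bars)  # write the count above each bar (matplotlib >= 3.4)
    ax.set_title("Messages per Weekday")
    ax.set_xlabel("Weekday")
    ax.set_ylabel("Messages")
    fig.savefig(out_path, dpi=100)
    return ax
```

With the `df` loaded at the top of this section, `plot_weekday_bars(df)` writes the image, and a following `plt.show()` displays it. Unlike the weekday pie chart, the bars keep a fixed Monday-to-Sunday order, which makes weekly patterns easier to compare.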
