Homework 2020.8.29

Latest recommended article published on 2022-12-07 21:49:02

QiuBeiXianSeng

Categories: Exercises, Homework, Data Analysis. Tags: python, data analysis

Article link: https://blog.csdn.net/QiuBeiXianSeng/article/details/108292924

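[Image: the UK and US YouTube data]
The data above is YouTube data for the UK and the US. Its columns are, in order: views (clicks), likes, dislikes, and comments.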
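
To make the column layout concrete, here is a minimal sketch of reading selected columns with `np.loadtxt`. The sample rows are made up and only assume the column order described above; they are not taken from the real CSV files.

```python
import io
import numpy as np

# Hypothetical sample rows in the assumed column order:
# views, likes, dislikes, comments
sample_csv = io.StringIO(
    "1000,50,2,10\n"
    "2500,120,5,30\n"
    "400,8,1,0\n"
)

# usecols selects columns by zero-based index;
# unpack=True returns one array per selected column
likes, comments = np.loadtxt(sample_csv, delimiter=",", usecols=(1, 3), unpack=True)

print(likes)
print(comments)
```

With the real files, the `io.StringIO` object would be replaced by the CSV file name.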

Exercise 1
Use Matplotlib to plot the comment counts for each country and show which range most comment counts fall into.

The code is as follows:

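import numpy as np
import matplotlib.pyplot as plt
import matplotlib

# Font settings: SimHei covers Chinese glyphs; Matplotlib warns and
# falls back to its default font if SimHei is not installed
font = {
    'family': 'SimHei',
    'weight': 'bold',
    'size': 12
}
matplotlib.rc("font", **font)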

# Load the UK comment counts (column index 3)
GB_comment = np.loadtxt('GB_video_data_numbers.csv', delimiter=',', usecols=3)

# Load the US comment counts (column index 3)
US_comment = np.loadtxt('US_video_data_numbers.csv', delimiter=',', usecols=3)

# Analyze with histograms
a = 50000  # bin width

GB_max = max(GB_comment)
GB_min = min(GB_comment)
GB_bins = (GB_max - GB_min) / a

US_max = max(US_comment)
US_min = min(US_comment)
US_bins = (US_max - US_min) / a

plt.figure(figsize=(20, 10))

# Left panel: UK comment counts
plt.subplot(1, 2, 1)
plt.hist(GB_comment, int(GB_bins), density=True, color='red')
plt.title('UK comment distribution', size=16)
x_tick1 = []
for i in range(int(GB_bins) + 2):
    x_tick1.append(a * i)
plt.xticks(x_tick1, [f'{i // 1000}k' for i in x_tick1], rotation=30)
plt.xlabel('Number of comments', size=15)
# density=True normalizes the histogram to a probability density
plt.ylabel('Density', size=15)

# Right panel: US comment counts
plt.subplot(1, 2, 2)
plt.hist(US_comment, int(US_bins), density=True, color='yellow')
plt.title('US comment distribution')
x_tick2 = []
for i in range(int(US_bins) + 2):
    x_tick2.append(a * i)
plt.xticks(x_tick2, [f'{i // 1000}k' for i in x_tick2], rotation=30)
plt.xlabel('Number of comments', size=15)
# density=True normalizes the histogram to a probability density
plt.ylabel('Density', size=15)

plt.show()

As shown in the figure:
[Figure: histograms of the UK and US comment counts]
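
The exercise asks which range holds most of the comment counts. Besides reading it off the plot, the same answer can be computed with `np.histogram`. The following is only a sketch: `busiest_bin` is a hypothetical helper, the demo values are made up, and it assumes non-negative counts with bins that start at 0.

```python
import numpy as np

def busiest_bin(values, width):
    """Return (low, high, share) for the fixed-width bin holding the most values."""
    values = np.asarray(values, dtype=float)
    # Number of bins of size `width` needed to cover [0, max]; at least one
    n_bins = max(1, int(np.ceil(values.max() / width)))
    edges = np.arange(n_bins + 1) * float(width)
    counts, edges = np.histogram(values, bins=edges)
    top = int(np.argmax(counts))
    return edges[top], edges[top + 1], counts[top] / counts.sum()

# Made-up comment counts, for illustration only
demo_comments = [120, 3000, 45000, 52000, 800, 10, 260000, 9000]
low, high, share = busiest_bin(demo_comments, width=50000)
print(f"{share:.0%} of videos have between {low:.0f} and {high:.0f} comments")
```

Passing the real comment array and the same bin width of 50000 would report the busiest interval directly.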
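Exercise 2
Plot a figure to analyze the relationship between the number of comments and the number of likes for UK YouTube videos.

The code is as follows: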

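import numpy as np
import matplotlib.pyplot as plt

# Load the UK likes (column 1) and comment counts (column 3)
GB_like, GB_comment = np.loadtxt('GB_video_data_numbers.csv', delimiter=',', usecols=(1, 3), unpack=True)

# Set the figure size
plt.figure(figsize=(20, 20))
# Plot likes on the x-axis against comments on the y-axis
plt.scatter(GB_like, GB_comment, marker='*', facecolor='red', linewidths=2)

# Set the title
plt.title('Comments vs. likes for UK YouTube videos', fontsize=30)

# Set the axis labels
plt.xlabel('Number of likes', fontdict={'color': 'black', 'size': 30})
plt.ylabel('Number of comments', fontdict={'color': 'black', 'size': 30})

plt.show()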

As shown in the figure:

[Figure: scatter plot of comments against likes for UK videos]
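
A scatter plot shows the relationship visually. One possible way to put a number on it is the Pearson correlation coefficient from `np.corrcoef`. This is a sketch on made-up values, not a result from the real data.

```python
import numpy as np

# Made-up like and comment counts, for illustration only
demo_likes = np.array([10, 200, 3500, 42000, 150000], dtype=float)
demo_comments = np.array([1, 15, 400, 3900, 21000], dtype=float)

# np.corrcoef returns a 2x2 correlation matrix;
# the off-diagonal entry is Pearson's r between the two arrays
r = np.corrcoef(demo_likes, demo_comments)[0, 1]
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would suggest that videos with more likes also tend to have more comments. Applied to the real `GB_like` and `GB_comment` arrays, the value may differ from this demo.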
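Exercise 3
Suppose we want to concatenate the two countries' data so they can be analyzed together.
• Attach an all-zeros array to label the UK rows
• Attach an all-ones array to label the US rows
• Concatenate the two countries' data

Code: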

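import numpy as np

# Load the UK data (one row per video)
GB = np.loadtxt('GB_video_data_numbers.csv', delimiter=',')
# Load the US data (one row per video)
US = np.loadtxt('US_video_data_numbers.csv', delimiter=',')

# A single column of zeros labels the UK rows
lab_GB = np.zeros((GB.shape[0], 1))
GB_new = np.hstack((lab_GB, GB))

# A single column of ones labels the US rows
lab_US = np.ones((US.shape[0], 1))
US_new = np.hstack((lab_US, US))

# Stack the labeled UK rows on top of the labeled US rows
answer = np.vstack((GB_new, US_new))
answer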

As shown in the figure:
[Figure: the concatenated array]
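
Once the two countries are stacked, the label lets each country's rows be recovered with a boolean mask. The sketch below uses made-up rows and assumes the label sits in the first column (0 for the UK, 1 for the US).

```python
import numpy as np

# Made-up rows in the order: views, likes, dislikes, comments
gb_demo = np.array([[100, 10, 1, 5], [200, 20, 2, 8]], dtype=float)
us_demo = np.array([[300, 30, 3, 9]], dtype=float)

# Prepend one label column: 0 for the UK, 1 for the US (assumed layout)
gb_labeled = np.hstack((np.zeros((gb_demo.shape[0], 1)), gb_demo))
us_labeled = np.hstack((np.ones((us_demo.shape[0], 1)), us_demo))
combined = np.vstack((gb_labeled, us_labeled))

# Boolean masks on the label column recover each country's rows
gb_rows = combined[combined[:, 0] == 0]
us_rows = combined[combined[:, 0] == 1]
print(combined.shape, gb_rows.shape, us_rows.shape)
```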
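Exercise 4
Use XMind to review the entire NumPy module.

This was my first time using it, so I am not very good at it yet.

[Image: XMind mind map of NumPy]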
