Study Notes Day 25 (numpy)

Tomorrow'sThinker

Article link: https://blog.csdn.net/a_Loki/article/details/122501152

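Data types:

dtype: specifies the data type when an array is created

astype: converts an array to another data type

round: rounds to a given number of decimals, e.g. np.round(data, 2) keeps two decimal places

np.shape(): returns the shape of an array

np.reshape(): changes the shape of an array

.flatten(): flattens an array to one dimension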
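As a minimal sketch of the calls listed above (the array contents and variable names are made up for illustration), the following shows how dtype, astype, rounding, and the shape helpers behave:

```python
import numpy as np

# dtype sets the element type when the array is created
a = np.array([1, 2, 3], dtype="float64")

# astype returns a converted copy; the original array keeps its type
b = a.astype("int64")

# np.round(data, 2) keeps two decimal places
c = np.round(np.array([0.12345, 1.98765]), 2)

d = np.arange(12)
e = d.reshape(3, 4)   # same 12 elements arranged as 3 rows and 4 columns
f = e.flatten()       # 1-D copy of e

print(a.dtype, b.dtype)
print(c)
print(np.shape(d), e.shape, f.shape)
```

Note that reshape and flatten both return new array objects, so d itself still has shape (12,) afterwards.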

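Broadcasting:

An array of shape (3,3,2) can be combined with arrays of shape (3,2), (1,2), or (3,1): shapes are compared starting from the trailing dimension, and each pair of dimensions must be equal or one of them must be 1. A (3,3) array matches only in some dimensions, so it has to be reshaped to (3,3,1) before it can be used.

An array of shape (3,3,3), by contrast, cannot be combined with one of shape (3,2).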
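The broadcasting rule can be checked directly. This is a small sketch using arrays of zeros and ones (hypothetical data) that tries each shape against a (3,3,2) array:

```python
import numpy as np

t = np.zeros((3, 3, 2))

# Shapes are aligned from the trailing dimension; each pair must be equal or 1
s1 = (t + np.ones((3, 2))).shape
s2 = (t + np.ones((1, 2))).shape
s3 = (t + np.ones((3, 1))).shape

# (3, 3) fails directly because the trailing dimensions are 2 and 3
try:
    t + np.ones((3, 3))
    direct_ok = True
except ValueError:
    direct_ok = False

# Adding a trailing axis of length 1 makes it compatible
s4 = (t + np.ones((3, 3)).reshape(3, 3, 1)).shape

print(s1, s2, s3, s4, direct_ok)
```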

Axes (axis):
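A minimal sketch of what the axis argument means, using a small hypothetical 2-D array: axis=0 runs down the rows (one result per column) and axis=1 runs across the columns (one result per row).

```python
import numpy as np

t = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

col_sums = t.sum(axis=0)   # collapses axis 0 (rows): one sum per column
row_sums = t.sum(axis=1)   # collapses axis 1 (columns): one sum per row

print(col_sums)   # [3 5 7]
print(row_sums)   # [ 3 12]
```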

Transpose (T):

np.arange(24).reshape(4,6).T
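For a 2-D array, the transpose swaps rows and columns. The sketch below reuses the expression above and shows two equivalent spellings, transpose() and swapaxes(1, 0):

```python
import numpy as np

t = np.arange(24).reshape(4, 6)

tt = t.T                   # shape goes from (4, 6) to (6, 4)
same_a = t.transpose()     # equivalent for 2-D arrays
same_b = t.swapaxes(1, 0)  # swaps axis 1 with axis 0

print(t.shape, tt.shape)
```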

Reading data with numpy:

# Read data with numpy
import numpy as np

# loadtxt reads a text file: delimiter is the field separator, dtype the data type,
# and unpack=True transposes the result (rows and columns swapped)
t1 = np.loadtxt('./data/US_video_data_numbers.csv', delimiter=',', dtype='int', unpack=True)
t2 = np.loadtxt('./data/US_video_data_numbers.csv', delimiter=',', dtype='int')
print(t1)
print("*" * 100)
print(t2)
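To see the effect of unpack without the CSV file, here is a small sketch that reads from an in-memory string. The csv_text content is a made-up stand-in, not the real data set:

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV file: two rows, three columns
csv_text = "1,2,3\n4,5,6\n"

rows = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype="int")
cols = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype="int", unpack=True)

print(rows.shape)   # (2, 3)
print(cols.shape)   # (3, 2): unpack=True returns the transpose
```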

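Indexing and slicing in numpy:

In t2[rows, cols], the part before the comma selects rows and the part after it selects columns. For non-contiguous rows or columns, pass a list of indices, e.g. t2[[2,4,6],:], where the inner brackets hold the row numbers to take.

A single element is t2[2,3]; several rows and columns are t2[2:4,1:3]; several non-adjacent points are t2[[0,1],[0,3]], which selects the elements at (0,0) and (1,3).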
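The indexing forms above can be tried on a small array. The 8x5 array here is a hypothetical stand-in for t2, chosen so the results are easy to read:

```python
import numpy as np

t2 = np.arange(40).reshape(8, 5)   # hypothetical stand-in for the loaded data

rows = t2[[2, 4, 6], :]        # rows 2, 4 and 6, all columns
single = t2[2, 3]              # element at row 2, column 3
block = t2[2:4, 1:3]           # rows 2-3, columns 1-2
points = t2[[0, 1], [0, 3]]    # elements at (0, 0) and (1, 3)

print(rows.shape)   # (3, 5)
print(single)       # 13
print(block)        # [[11 12] [16 17]]
print(points)       # [0 8]
```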

The numpy ternary operator:

np.where(t<10,10,2)

# Where a value in t is less than 10, the result is 10; otherwise it is 2

t.clip(10,18)

# Values in t below 10 become 10, values above 18 become 18
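A short sketch of both calls on a hypothetical array, to make the difference visible: np.where replaces every element with one of two values, while clip only touches values outside the range.

```python
import numpy as np

t = np.array([[3, 9, 12], [15, 18, 25]])

w = np.where(t < 10, 10, 2)   # 10 where t < 10, otherwise 2
c = t.clip(10, 18)            # values limited to the range [10, 18]

print(w)   # [[10 10  2] [ 2  2  2]]
print(c)   # [[10 10 12] [15 18 18]]
```

Both calls return new arrays; t itself is left unchanged.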

nan in numpy:
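A minimal sketch of the nan properties that the replacement code later in these notes relies on (the array is made up for illustration): nan is a float, it is not equal to itself, and any ordinary arithmetic involving nan yields nan.

```python
import numpy as np

t = np.array([1.0, np.nan, 3.0, np.nan])

print(np.nan == np.nan)   # False: nan never equals anything, including itself

# Two ways to count nan values
count_a = np.count_nonzero(t != t)
count_b = np.count_nonzero(np.isnan(t))

total = np.sum(t)         # nan, because nan propagates through arithmetic
safe_total = np.nansum(t) # 4.0, nan values are skipped

print(count_a, count_b, total, safe_total)
```

Because nan is a float, an integer array has to be converted with astype('float') before nan can be assigned into it.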

Common statistical functions in numpy:
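The notes do not list the functions, so the following is only a sketch of some commonly used ones on a hypothetical array. Each accepts an optional axis argument; without it the whole array is used.

```python
import numpy as np

t = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 9.0]])

total = t.sum()                  # sum of all elements
col_mean = t.mean(axis=0)        # mean of each column
row_median = np.median(t, axis=1)
row_max = t.max(axis=1)
col_min = t.min(axis=0)
value_range = np.ptp(t)          # peak to peak: max minus min
spread = t.std()                 # standard deviation

print(total, col_mean, row_median, row_max, col_min, value_range, spread)
```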

Replacing nan values in numpy:

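# Replace nan values in a numpy array with the column mean

import numpy as np



def mean_nan(t1):
    for i in range(t1.shape[1]):
        temp_col = t1[:, i]  # a view, so assignments below modify t1 in place
        nan_num = np.count_nonzero(temp_col != temp_col)  # nan != nan, so this counts the nan values
        if nan_num != 0:  # non-zero: the column contains nan
            # Boolean indexing keeps only the entries where the mask is True,
            # i.e. the non-nan values of the column
            temp_not_nan = temp_col[temp_col == temp_col]

            temp_col[np.isnan(temp_col)] = temp_not_nan.mean()
            print(temp_not_nan.mean())
            print(temp_col[np.isnan(temp_col)])  # empty: every nan has been filled

    return t1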

if __name__ == '__main__':
    t1 = np.arange(12).reshape(3, 4).astype('float')
    print(t1)

    t1[1:, 3:] = np.nan

    print("*" * 100)
    print(t1)

    t1 = mean_nan(t1)
    print(t1)


Visualizing the data read above:

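# Visualize the loaded data (histogram)
import matplotlib.pyplot as plt

# Keep only the comment counts (the last column)
t2 = t2[:, -1]

# Check the maximum and minimum
print(t2.max(), t2.min())

# Bin width
d = 10000

# Number of bins
num_bins = (t2.max() - t2.min()) // d

# Draw the histogram
plt.hist(t2, num_bins)

plt.show()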


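The histogram shows that values above 5000 are relatively few and distort the picture,

so trim the data and check the maximum and minimum again: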

t2 = t2[t2 < 5000]

# Check the maximum and minimum
print(t2.max(), t2.min())


Set a new bin width and a figure size:


# Bin width
d = 250

# Number of bins
num_bins = (t2.max() - t2.min()) // d

# Figure size
plt.figure(figsize=(20, 8), dpi=80)

# Draw the histogram
plt.hist(t2, num_bins)

plt.show()


Concatenating arrays:

np.hstack: horizontal concatenation

np.vstack: vertical concatenation

t1[[2,1],:] = t1[[1,2],:] swaps rows 1 and 2

t1[:,[0,2]] = t1[:,[2,0]] swaps columns 0 and 2
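A small sketch of stacking and swapping, using hypothetical arrays a, b, and t1. Note that numpy indexing uses square brackets, and that the swap works because the right-hand side is copied before the assignment happens.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(6, 12).reshape(2, 3)

v = np.vstack((a, b))   # b placed below a: shape (4, 3)
h = np.hstack((a, b))   # b placed to the right of a: shape (2, 6)

t1 = np.arange(12).reshape(3, 4)
t1[[1, 2], :] = t1[[2, 1], :]   # swap rows 1 and 2
t1[:, [0, 2]] = t1[:, [2, 0]]   # swap columns 0 and 2

print(v.shape, h.shape)
print(t1)
```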

More methods:

Generating random numbers with numpy:
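The notes stop at the heading, so this is only a sketch of one way to generate random numbers, using the np.random.default_rng generator. The seed value and array sizes are arbitrary choices for illustration; the older np.random.randint and np.random.seed functions also exist.

```python
import numpy as np

# A fixed seed makes the sequence reproducible between runs
rng = np.random.default_rng(seed=10)

u = rng.random((2, 3))                         # uniform floats in [0, 1)
i = rng.integers(10, 20, size=(4, 5))          # integers in [10, 20)
n = rng.normal(loc=0.0, scale=1.0, size=1000)  # normal distribution samples

print(u)
print(i)
print(n.mean(), n.std())
```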
