Machine Learning: Preparing Data with NumPy (Part 2)

四果汤多加陈皮才酸爽

Category: Numpy. Tags: Numpy, Python, machine learning

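I. Data Processing with Arrays

1. NumPy arrays let you express many kinds of data processing tasks as concise array expressions that would otherwise require writing loops. Replacing explicit loops with array expressions is commonly called vectorization.

2. Vectorized array operations are often one to two orders of magnitude faster than their pure-Python equivalents.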
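As a rough illustration of the speed claim above, the sketch below squares every element of an array twice: once with an explicit Python loop and once with a single array expression, timing both. The array size and the function names `square_with_loop` and `square_vectorized` are assumptions made for this example, and the measured speedup will vary with the machine and the array size.

```python
import timeit

import numpy as np

# Hypothetical test data; the size is chosen only for illustration
values = np.arange(100_000, dtype=np.float64)


def square_with_loop(data):
    """Square each element one at a time in a Python loop."""
    result = np.empty_like(data)
    for i in range(len(data)):
        result[i] = data[i] * data[i]
    return result


def square_vectorized(data):
    """Square all elements with one array expression."""
    return data * data


loop_time = timeit.timeit(lambda: square_with_loop(values), number=1)
vec_time = timeit.timeit(lambda: square_vectorized(values), number=1)
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")

# Both versions must produce the same numbers
same = np.array_equal(square_with_loop(values), square_vectorized(values))
print(same)
```

Only the timings differ between runs; the two functions always return identical arrays.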

A concrete matplotlib plotting example

# -*- coding: utf-8 -*-

import matplotlib.pyplot as plt
import numpy as np

points = np.arange(-5, 5, 0.01)  # 1000 evenly spaced points from -5 to 4.99
# np.meshgrid takes two 1-D arrays and returns two 2-D coordinate matrices
# that together cover every (x, y) pair on the grid
xs, ys = np.meshgrid(points, points)  # the grid is square, so xs and ys are transposes of each other
print(xs)
print('-----------------I am just a perfect dividing line----------------')
print(ys)
print('-----------------I am just a perfect dividing line----------------')
z = np.sqrt(xs ** 2 + ys ** 2)
print(z)
# imshow draws z as an image; cmap selects the colormap, here grayscale
plt.imshow(z, cmap=plt.cm.gray)
# Add the color bar on the right side of the figure (values run from 0 to about 7)
plt.colorbar()
# Set the figure title (raw string, so the backslash in \sqrt is kept literally)
plt.title(r"Image plot of $\sqrt{x^2 + y^2}$ for a grid of values")
# Show the figure
plt.show()

Output:

[[-5.   -4.99 -4.98 ...,  4.97  4.98  4.99]
 [-5.   -4.99 -4.98 ...,  4.97  4.98  4.99]
 [-5.   -4.99 -4.98 ...,  4.97  4.98  4.99]
 ..., 
 [-5.   -4.99 -4.98 ...,  4.97  4.98  4.99]
 [-5.   -4.99 -4.98 ...,  4.97  4.98  4.99]
 [-5.   -4.99 -4.98 ...,  4.97  4.98  4.99]]
-----------------I am just a perfect dividing line----------------
[[-5.   -5.   -5.   ..., -5.   -5.   -5.  ]
 [-4.99 -4.99 -4.99 ..., -4.99 -4.99 -4.99]
 [-4.98 -4.98 -4.98 ..., -4.98 -4.98 -4.98]
 ..., 
 [ 4.97  4.97  4.97 ...,  4.97  4.97  4.97]
 [ 4.98  4.98  4.98 ...,  4.98  4.98  4.98]
 [ 4.99  4.99  4.99 ...,  4.99  4.99  4.99]]
-----------------I am just a perfect dividing line----------------
[[ 7.07106781  7.06400028  7.05693985 ...,  7.04988652  7.05693985
   7.06400028]
 [ 7.06400028  7.05692568  7.04985815 ...,  7.04279774  7.04985815
   7.05692568]
 [ 7.05693985  7.04985815  7.04278354 ...,  7.03571603  7.04278354
   7.04985815]
 ..., 
 [ 7.04988652  7.04279774  7.03571603 ...,  7.0286414   7.03571603
   7.04279774]
 [ 7.05693985  7.04985815  7.04278354 ...,  7.03571603  7.04278354
   7.04985815]
 [ 7.06400028  7.05692568  7.04985815 ...,  7.04279774  7.04985815
   7.05692568]]
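The grid in the example above is too large to print in full, so it can be hard to see what `np.meshgrid` actually returns. The sketch below repeats the same steps on a three-point grid (the values in `points` are chosen here purely for readability) so that every entry of `xs`, `ys`, and `z` is visible.

```python
import numpy as np

# A tiny grid so that the full arrays fit on screen
points = np.array([-1.0, 0.0, 1.0])
xs, ys = np.meshgrid(points, points)

print(xs)  # every row is a copy of points (the x coordinates)
print(ys)  # every column is a copy of points (the y coordinates)

# Distance of each grid point from the origin, computed without any loop
z = np.sqrt(xs ** 2 + ys ** 2)
print(z)
```

Because both inputs are the same 1-D array, `xs` and `ys` are transposes of each other, and the center entry of `z` is zero because the middle grid point is the origin.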

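II. Data Processing with Arrays: Expressing Conditional Logic as Array Operations

1. Limitations of list comprehensions

A. They cannot be applied directly to multidimensional arrays

B. They run as pure Python code, so they are not fast enough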
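To make the first limitation concrete, here is a minimal sketch on a 2-D input. The arrays `x_2d`, `y_2d`, and `cond_2d` are made up for this example. The list comprehension needs one extra level of nesting per dimension, while `np.where` (described in the next part) handles the same selection in one call.

```python
import numpy as np

# Hypothetical 2-D inputs, chosen only for illustration
x_2d = np.array([[1.0, 2.0], [3.0, 4.0]])
y_2d = np.array([[10.0, 20.0], [30.0, 40.0]])
cond_2d = np.array([[True, False], [False, True]])

# A list comprehension has to be nested once per dimension
nested = [
    [(x if c else y) for x, y, c in zip(x_row, y_row, c_row)]
    for x_row, y_row, c_row in zip(x_2d, y_2d, cond_2d)
]

# np.where works on arrays of any dimension with a single call
vectorized = np.where(cond_2d, x_2d, y_2d)

print(nested)
print(vectorized)
```

Both approaches select the same elements, but the nested comprehension returns a list of lists and would need rewriting for a 3-D input.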

2. where and nested where

numpy.where(condition[, x, y])

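1. x and y are optional (pass both or neither), and condition is the test. All three arguments are array_like and must be broadcastable to a common shape; they do not need to have identical dimensions, so scalars are allowed.

2. Wherever condition is True, the output takes the element at the corresponding position of x; otherwise it takes the element at the corresponding position of y.

3. If only condition is given, the function returns the indices of the elements that are True, as a tuple of index arrays (one per dimension).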
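The two calling forms described in the list above can be sketched as follows. The array `arr` and the names `rows`, `cols`, and `clipped` are illustrative choices, not part of the original example.

```python
import numpy as np

arr = np.array([[3, -1], [-2, 5]])

# Condition only: a tuple of index arrays, one per dimension
rows, cols = np.where(arr > 0)
print(rows, cols)

# Condition plus x and y: the scalar 0 is broadcast against arr
clipped = np.where(arr > 0, arr, 0)
print(clipped)
```

The index arrays returned by the one-argument form can be used directly for indexing, for example `arr[rows, cols]` selects the positive elements.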

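The logic of where amounts to a for loop plus an if test.

It behaves roughly like the following pseudocode, where n is the number of elements:

for (i = 0; i < n; i++) {
    r[i] = cond[i] ? a[i] : b[i];
}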

# -*- coding: utf-8 -*-

import numpy as np
import numpy.random as np_random

# A note on zip: it accepts any number of iterables and regroups their
# elements into tuples. In Python 3 it returns an iterator, so
# list(zip([1, 2, 3], [4, 5, 6], [7, 8, 9]))
# gives [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
print('Selecting elements with a truth table')
x_arr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
y_arr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])
result = [(x if c else y) for x, y, c in zip(x_arr, y_arr, cond)]  # list comprehension version
print(result)
print(np.where(cond, x_arr, y_arr))  # NumPy where version
# The two results hold the same values: where() takes the element from x_arr
# where cond is True and from y_arr where cond is False
print('-----------------It is just a dividing line.------------------')

print('More where examples')
arr = np_random.randn(4, 4)
print(arr)
print(np.where(arr > 0, 2, -2))   # positive entries become 2, all others -2
print(np.where(arr > 0, 2, arr))  # positive entries become 2, all others are kept
print('-----------------It is just a dividing line.------------------')

print('Nested where')
cond_1 = np.array([True, False, True, True, False])
cond_2 = np.array([False, True, False, True, False])
# Traditional loop version
result = []
# Python 3's range, like Python 2's xrange, is lazy and does not build a list
for i in range(len(cond_1)):
    if cond_1[i] and cond_2[i]:
        result.append(0)
    elif cond_1[i]:
        result.append(1)
    elif cond_2[i]:
        result.append(2)
    else:
        result.append(3)
print(result)
# NumPy version
result = np.where(cond_1 & cond_2, 0,
                  np.where(cond_1, 1, np.where(cond_2, 2, 3)))
print(result)

Output:

Selecting elements with a truth table
[1.1000000000000001, 2.2000000000000002, 1.3, 1.3999999999999999, 2.5]
[ 1.1  2.2  1.3  1.4  2.5]
-----------------It is just a dividing line.------------------
More where examples
[[-0.07481239 -2.26793356 -0.36476135  1.26520446]
 [ 0.8869949  -1.47082107  0.5020557  -1.30422553]
 [ 0.62391739 -0.10304553  0.41570237  0.26579114]
 [ 1.33762121 -0.17479677 -0.00611915  1.04369295]]
[[-2 -2 -2  2]
 [ 2 -2  2 -2]
 [ 2 -2  2  2]
 [ 2 -2 -2  2]]
[[-0.07481239 -2.26793356 -0.36476135  2.        ]
 [ 2.         -1.47082107  2.         -1.30422553]
 [ 2.         -0.10304553  2.          2.        ]
 [ 2.         -0.17479677 -0.00611915  2.        ]]
-----------------It is just a dividing line.------------------
Nested where
[1, 2, 1, 0, 3]
[1 2 1 0 3]
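Deeply nested where calls become hard to read as the number of branches grows. One possible alternative, not used in the original example, is `np.select`, which takes a list of conditions and a matching list of choices and uses the first condition that is True at each position. The sketch below reuses the two condition arrays from the nested example; the names `selected` and `nested` are introduced here only for the comparison.

```python
import numpy as np

cond_1 = np.array([True, False, True, True, False])
cond_2 = np.array([False, True, False, True, False])

# Conditions are checked in order, so the most specific one comes first;
# default covers positions where no condition is True
selected = np.select([cond_1 & cond_2, cond_1, cond_2], [0, 1, 2], default=3)
print(selected)

# The nested where version, kept here for comparison
nested = np.where(cond_1 & cond_2, 0,
                  np.where(cond_1, 1, np.where(cond_2, 2, 3)))
print(np.array_equal(selected, nested))
```

The order of the condition list plays the same role as the order of the if and elif branches in the loop version.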
