Mathematical Modeling: Data Analysis, Part 2


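import numpy as np
# np.info prints the docs and returns None, so help() then documents NoneType; np.info(np.add) alone suffices.
bz = help(np.info(np.add))
print(bz)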

4-Create an array of the integers 2 to 20 and reverse it

sz = np.arange(2, 21, 1)
print(sz)
sz = sz[::-1]
print(sz)

5-Find the indices of the nonzero elements of an array

sy = np.nonzero([2, 53, 12, 43, 0, 0, 0, 23, 90])
print(sy)
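
np.nonzero returns a tuple with one index array per axis, which is why the result prints as a one-element tuple. A minimal sketch of unpacking that tuple and using it to pull the nonzero values back out; the names `data`, `idx`, and `grid` are illustrative and not part of the original exercise:

```python
import numpy as np

data = np.array([2, 53, 12, 43, 0, 0, 0, 23, 90])

# np.nonzero returns a tuple holding one index array per axis.
(idx,) = np.nonzero(data)
print(idx)          # positions of the nonzero entries
print(data[idx])    # the nonzero values themselves

# For a 2-D array the tuple holds row indices and column indices.
grid = np.array([[0, 1], [2, 0]])
rows, cols = np.nonzero(grid)
print(list(zip(rows.tolist(), cols.tolist())))
```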

Output of the code above:
1.22.3
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
200 bytes
add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

Add arguments element-wise.

Parameters
----------
x1, x2 : array_like
    The arrays to be added.
    If ``x1.shape != x2.shape``, they must be broadcastable to a common
    shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If provided, it must have
    a shape that the inputs broadcast to. If not provided or None,
    a freshly-allocated array is returned. A tuple (possible only as a
    keyword argument) must have length equal to the number of outputs.
where : array_like, optional
    This condition is broadcast over the input. At locations where the
    condition is True, the `out` array will be set to the ufunc result.
    Elsewhere, the `out` array will retain its original value.
    Note that if an uninitialized `out` array is created via the default
    ``out=None``, locations within it where the condition is False will
    remain uninitialized.
**kwargs
    For other keyword-only arguments, see the
    :ref:`ufunc docs <ufuncs.kwargs>`.

Returns
-------
add : ndarray or scalar
    The sum of `x1` and `x2`, element-wise.
    This is a scalar if both `x1` and `x2` are scalars.

Notes
-----
Equivalent to `x1` + `x2` in terms of array broadcasting.

Examples
--------
>>> np.add(1.0, 4.0)
5.0
>>> x1 = np.arange(9.0).reshape((3, 3))
>>> x2 = np.arange(3.0)
>>> np.add(x1, x2)
array([[  0.,   2.,   4.],
       [  3.,   5.,   7.],
       [  6.,   8.,  10.]])

The ``+`` operator can be used as a shorthand for ``np.add`` on ndarrays.

>>> x1 = np.arange(9.0).reshape((3, 3))
>>> x2 = np.arange(3.0)
>>> x1 + x2
array([[ 0.,  2.,  4.],
       [ 3.,  5.,  7.],
       [ 6.,  8., 10.]])
Help on NoneType object:

class NoneType(object)
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.

None
[ 2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
[20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2]
(array([0, 1, 2, 3, 7, 8], dtype=int32),)



6-Build a random 3*3 matrix and print its maximum and minimum values

zz = np.random.random((3, 3))
print(zz.max())
print(zz.min())

7-Build a 5*5 matrix of ones and surround it with a border of zeros

jz = np.ones((5, 5))
jz = np.pad(jz, pad_width=1, mode='constant', constant_values=0)
print(jz)

print(help(np.pad))     # help text; help() returns None, so "None" is printed after it
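
The same zero border can also be built without np.pad, by allocating the larger array first and filling its interior. This is only a sketch of an alternative and not the method used in the exercise; the names `framed` and `padded` are made up for the comparison:

```python
import numpy as np

# Start from a 7x7 block of zeros and fill the 5x5 interior with ones.
framed = np.zeros((7, 7))
framed[1:-1, 1:-1] = 1

# Same result as padding a 5x5 block of ones with a one-cell zero border.
padded = np.pad(np.ones((5, 5)), pad_width=1, mode='constant', constant_values=0)
print(np.array_equal(framed, padded))
```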

8-For an array of shape (6, 7, 8), find the multi-dimensional index of the element at flat index 100

sy8 = np.unravel_index(100, (6, 7, 8))
print(sy8)
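
np.unravel_index treats its first argument as a zero-based flat position in row-major order, so 100 refers to the 101st element. A small sketch that checks the mapping in both directions; the array `arr` is an assumption added here so the result can be verified against real data:

```python
import numpy as np

shape = (6, 7, 8)

# Flat index 100 is the 101st element in C (row-major) order.
multi = np.unravel_index(100, shape)
print(multi)

# np.ravel_multi_index is the inverse mapping.
flat = np.ravel_multi_index(multi, shape)
print(flat)

# Cross-check against an array whose values are their own flat positions.
arr = np.arange(6 * 7 * 8).reshape(shape)
print(arr[multi])
```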

9-Normalize a 5*5 matrix (min-max scaling)

cz = np.random.random((5, 5))
cz_max = cz.max()
cz_min = cz.min()
cz = (cz-cz_min)/(cz_max-cz_min)
print(cz)
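
The formula divides by `max - min`, which is zero when every entry is the same. One possible way to guard against that is sketched here; the helper name `min_max_normalize` and the choice to return zeros for a constant input are assumptions, not something the exercise specifies:

```python
import numpy as np

def min_max_normalize(a):
    """Scale a to [0, 1]; return all zeros if every value is equal."""
    a = np.asarray(a, dtype=float)
    span = a.max() - a.min()
    if span == 0:
        return np.zeros_like(a)
    return (a - a.min()) / span

rng = np.random.default_rng(0)
m = min_max_normalize(rng.random((5, 5)))
print(m.min(), m.max())

# A constant matrix would otherwise trigger a division by zero.
const = min_max_normalize(np.full((2, 2), 7.0))
print(const)
```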

10-Find the values common to two arrays

sz1 = np.random.randint(0, 20, 8)
sz2 = np.random.randint(0, 20, 8)
print(sz1)
print(sz2)

print(np.intersect1d(sz1, sz2))
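
When the positions of the shared values matter as well, np.intersect1d accepts `return_indices=True`. The sketch below uses fixed arrays (copied from the sample output of this exercise) instead of random ones so the result is reproducible:

```python
import numpy as np

a = np.array([19, 14, 0, 13, 12, 10, 3, 6])
b = np.array([3, 15, 10, 15, 3, 9, 16, 11])

# common: sorted shared values; ia / ib: index of the first occurrence in a / b.
common, ia, ib = np.intersect1d(a, b, return_indices=True)
print(common)
print(ia, ib)
```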

Output of the code above:
0.9786237847073697
0.10837689046425514
[[0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 1. 1. 1. 1. 0.]
 [0. 1. 1. 1. 1. 1. 0.]
 [0. 1. 1. 1. 1. 1. 0.]
 [0. 1. 1. 1. 1. 1. 0.]
 [0. 1. 1. 1. 1. 1. 0.]
 [0. 0. 0. 0. 0. 0. 0.]]
Help on function pad in module numpy:

pad(array, pad_width, mode='constant', **kwargs)
    Pad an array.
    
    Parameters
    ----------
    array : array_like of rank N
        The array to pad.
    pad_width : {sequence, array_like, int}
        Number of values padded to the edges of each axis.
        ((before_1, after_1), ... (before_N, after_N)) unique pad widths
        for each axis.
        ((before, after),) yields same before and after pad for each axis.
        (pad,) or int is a shortcut for before = after = pad width for all
        axes.
    mode : str or function, optional
        One of the following string values or a user supplied function.
    
        'constant' (default)
            Pads with a constant value.
        'edge'
            Pads with the edge values of array.
        'linear_ramp'
            Pads with the linear ramp between end_value and the
            array edge value.
        'maximum'
            Pads with the maximum value of all or part of the
            vector along each axis.
        'mean'
            Pads with the mean value of all or part of the
            vector along each axis.
        'median'
            Pads with the median value of all or part of the
            vector along each axis.
        'minimum'
            Pads with the minimum value of all or part of the
            vector along each axis.
        'reflect'
            Pads with the reflection of the vector mirrored on
            the first and last values of the vector along each
            axis.
        'symmetric'
            Pads with the reflection of the vector mirrored
            along the edge of the array.
        'wrap'
            Pads with the wrap of the vector along the axis.
            The first values are used to pad the end and the
            end values are used to pad the beginning.
        'empty'
            Pads with undefined values.
    
            .. versionadded:: 1.17
    
        <function>
            Padding function, see Notes.
    stat_length : sequence or int, optional
        Used in 'maximum', 'mean', 'median', and 'minimum'.  Number of
        values at edge of each axis used to calculate the statistic value.
    
        ((before_1, after_1), ... (before_N, after_N)) unique statistic
        lengths for each axis.
    
        ((before, after),) yields same before and after statistic lengths
        for each axis.
    
        (stat_length,) or int is a shortcut for before = after = statistic
        length for all axes.
    
        Default is ``None``, to use the entire axis.
    constant_values : sequence or scalar, optional
        Used in 'constant'.  The values to set the padded values for each
        axis.
    
        ``((before_1, after_1), ... (before_N, after_N))`` unique pad constants
        for each axis.
    
        ``((before, after),)`` yields same before and after constants for each
        axis.
    
        ``(constant,)`` or ``constant`` is a shortcut for ``before = after = constant`` for
        all axes.
    
        Default is 0.
    end_values : sequence or scalar, optional
        Used in 'linear_ramp'.  The values used for the ending value of the
        linear_ramp and that will form the edge of the padded array.
    
        ``((before_1, after_1), ... (before_N, after_N))`` unique end values
        for each axis.
    
        ``((before, after),)`` yields same before and after end values for each
        axis.
    
        ``(constant,)`` or ``constant`` is a shortcut for ``before = after = constant`` for
        all axes.
    
        Default is 0.
    reflect_type : {'even', 'odd'}, optional
        Used in 'reflect', and 'symmetric'.  The 'even' style is the
        default with an unaltered reflection around the edge value.  For
        the 'odd' style, the extended part of the array is created by
        subtracting the reflected values from two times the edge value.
    
    Returns
    -------
    pad : ndarray
        Padded array of rank equal to `array` with shape increased
        according to `pad_width`.
    
    Notes
    -----
    .. versionadded:: 1.7.0
    
    For an array with rank greater than 1, some of the padding of later
    axes is calculated from padding of previous axes.  This is easiest to
    think about with a rank 2 array where the corners of the padded array
    are calculated by using padded values from the first axis.
    
    The padding function, if used, should modify a rank 1 array in-place. It
    has the following signature::
    
        padding_func(vector, iaxis_pad_width, iaxis, kwargs)
    
    where
    
        vector : ndarray
            A rank 1 array already padded with zeros.  Padded values are
            vector[:iaxis_pad_width[0]] and vector[-iaxis_pad_width[1]:].
        iaxis_pad_width : tuple
            A 2-tuple of ints, iaxis_pad_width[0] represents the number of
            values padded at the beginning of vector where
            iaxis_pad_width[1] represents the number of values padded at
            the end of vector.
        iaxis : int
            The axis currently being calculated.
        kwargs : dict
            Any keyword arguments the function requires.
    
    Examples
    --------
    >>> a = [1, 2, 3, 4, 5]
    >>> np.pad(a, (2, 3), 'constant', constant_values=(4, 6))
    array([4, 4, 1, ..., 6, 6, 6])
    
    >>> np.pad(a, (2, 3), 'edge')
    array([1, 1, 1, ..., 5, 5, 5])
    
    >>> np.pad(a, (2, 3), 'linear_ramp', end_values=(5, -4))
    array([ 5,  3,  1,  2,  3,  4,  5,  2, -1, -4])
    
    >>> np.pad(a, (2,), 'maximum')
    array([5, 5, 1, 2, 3, 4, 5, 5, 5])
    
    >>> np.pad(a, (2,), 'mean')
    array([3, 3, 1, 2, 3, 4, 5, 3, 3])
    
    >>> np.pad(a, (2,), 'median')
    array([3, 3, 1, 2, 3, 4, 5, 3, 3])
    
    >>> a = [[1, 2], [3, 4]]
    >>> np.pad(a, ((3, 2), (2, 3)), 'minimum')
    array([[1, 1, 1, 2, 1, 1, 1],
           [1, 1, 1, 2, 1, 1, 1],
           [1, 1, 1, 2, 1, 1, 1],
           [1, 1, 1, 2, 1, 1, 1],
           [3, 3, 3, 4, 3, 3, 3],
           [1, 1, 1, 2, 1, 1, 1],
           [1, 1, 1, 2, 1, 1, 1]])
    
    >>> a = [1, 2, 3, 4, 5]
    >>> np.pad(a, (2, 3), 'reflect')
    array([3, 2, 1, 2, 3, 4, 5, 4, 3, 2])
    
    >>> np.pad(a, (2, 3), 'reflect', reflect_type='odd')
    array([-1,  0,  1,  2,  3,  4,  5,  6,  7,  8])
    
    >>> np.pad(a, (2, 3), 'symmetric')
    array([2, 1, 1, 2, 3, 4, 5, 5, 4, 3])
    
    >>> np.pad(a, (2, 3), 'symmetric', reflect_type='odd')
    array([0, 1, 1, 2, 3, 4, 5, 5, 6, 7])
    
    >>> np.pad(a, (2, 3), 'wrap')
    array([4, 5, 1, 2, 3, 4, 5, 1, 2, 3])
    
    >>> def pad_with(vector, pad_width, iaxis, kwargs):
    ...     pad_value = kwargs.get('padder', 10)
    ...     vector[:pad_width[0]] = pad_value
    ...     vector[-pad_width[1]:] = pad_value
    >>> a = np.arange(6)
    >>> a = a.reshape((2, 3))
    >>> np.pad(a, 2, pad_with)
    array([[10, 10, 10, 10, 10, 10, 10],
           [10, 10, 10, 10, 10, 10, 10],
           [10, 10,  0,  1,  2, 10, 10],
           [10, 10,  3,  4,  5, 10, 10],
           [10, 10, 10, 10, 10, 10, 10],
           [10, 10, 10, 10, 10, 10, 10]])
    >>> np.pad(a, 2, pad_with, padder=100)
    array([[100, 100, 100, 100, 100, 100, 100],
           [100, 100, 100, 100, 100, 100, 100],
           [100, 100,   0,   1,   2, 100, 100],
           [100, 100,   3,   4,   5, 100, 100],
           [100, 100, 100, 100, 100, 100, 100],
           [100, 100, 100, 100, 100, 100, 100]])

None
(1, 5, 4)
[[0.275 0.437 0.958 0.833 0.339]
 [0.174 0.376 0.    0.253 0.81 ]
 [0.01  0.608 0.613 0.102 0.386]
 [0.032 0.907 1.    0.056 0.907]
 [0.586 0.756 0.64  0.591 0.015]]
[19 14  0 13 12 10  3  6]
[ 3 15 10 15  3  9 16 11]
[ 3 10]


11-Get yesterday's, today's, and tomorrow's dates

yes = np.datetime64('today', 'D') - np.timedelta64(1, 'D')
tod = np.datetime64('today', 'D')
tom = np.datetime64('today', 'D') + np.timedelta64(1, 'D')
print(f"昨天是{yes}")  # "Yesterday is ..."
print(f"今天是{tod}")  # "Today is ..."
print(f"明天是{tom}")  # "Tomorrow is ..."

12-Get all the days in a month

tt = np.arange('2022-08', '2022-09', dtype='datetime64[D]')
print(tt)
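
The month boundaries are hard-coded above. A rough sketch of how the same idea could be wrapped for any month follows; the helper `days_in_month` and its 'YYYY-MM' string argument are assumptions introduced for illustration:

```python
import numpy as np

def days_in_month(month):
    """Return every day of a 'YYYY-MM' month as a datetime64[D] array."""
    start = np.datetime64(month, 'M')
    stop = start + np.timedelta64(1, 'M')   # first day of the next month
    return np.arange(start.astype('datetime64[D]'),
                     stop.astype('datetime64[D]'),
                     dtype='datetime64[D]')

aug = days_in_month('2022-08')
feb = days_in_month('2024-02')
print(len(aug), aug[0], aug[-1])
print(len(feb))
```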

13-Get the integer part of a number

xs = np.random.uniform(0, 20, 8)
print(xs)

print(np.floor(xs))
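
np.floor gives the integer part here only because the samples are non-negative. For negative inputs it rounds toward negative infinity, while np.trunc drops the fractional part. A short comparison on a hand-picked array (`vals` is illustrative):

```python
import numpy as np

vals = np.array([2.7, -2.7, 0.5, -0.5])

floored = np.floor(vals)      # rounds toward negative infinity
truncated = np.trunc(vals)    # rounds toward zero
as_int = vals.astype(int)     # casting to int also truncates toward zero

print(floored)
print(truncated)
print(as_int)
```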

14-Create an array that cannot be modified (read-only)

# Left commented out: the assignment raises "ValueError: assignment destination is read-only".
# zz = np.zeros(5)
# zz.flags.writeable = False
# zz[0] = 2
# print(zz[0])
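
To see the read-only flag in action without stopping the script, the write can be wrapped in try/except. This is a sketch with illustrative names (`ro`, `rw`, `modified`), not part of the original code:

```python
import numpy as np

ro = np.zeros(5)
ro.flags.writeable = False

try:
    ro[0] = 2
    modified = True
except ValueError as exc:
    modified = False
    print("write rejected:", exc)

# A copy owns its own data and is writable again.
rw = ro.copy()
rw[0] = 2
print(ro[0], rw[0])
```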

15-Print only part of a large array

np.set_printoptions(threshold=5)
bq = np.zeros((20, 20))
print(bq)

Output of the code above:
昨天是2022-08-29
今天是2022-08-30
明天是2022-08-31
['2022-08-01' '2022-08-02' '2022-08-03' '2022-08-04' '2022-08-05'
 '2022-08-06' '2022-08-07' '2022-08-08' '2022-08-09' '2022-08-10'
 '2022-08-11' '2022-08-12' '2022-08-13' '2022-08-14' '2022-08-15'
 '2022-08-16' '2022-08-17' '2022-08-18' '2022-08-19' '2022-08-20'
 '2022-08-21' '2022-08-22' '2022-08-23' '2022-08-24' '2022-08-25'
 '2022-08-26' '2022-08-27' '2022-08-28' '2022-08-29' '2022-08-30'
 '2022-08-31']
[16.229 12.806 12.496  2.91  11.404  1.302  6.268  4.341]
[16. 12. 12.  2. 11.  1.  6.  4.]
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]



16-Find the index of the array element closest to a given number

zd = np.arange(100)
vv = np.random.uniform(0, 100)
print(vv)

index = (np.abs(zd-vv)).argmin()
print(zd[index])
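
The same expression works on unsorted arrays too. Below is a small sketch with a fixed target so the result can be checked by hand; the helper name `nearest_index` is hypothetical. On a tie, argmin returns the first matching position.

```python
import numpy as np

def nearest_index(arr, target):
    """Index of the element of arr with the smallest absolute distance to target."""
    arr = np.asarray(arr)
    return int(np.abs(arr - target).argmin())

values = np.array([10.0, 2.5, 7.0, 3.0])
i = nearest_index(values, 2.9)
print(i, values[i])

# 1 and 3 are equally far from 2; the first one wins.
tie = nearest_index([1, 3], 2)
print(tie)
```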

17-Convert between 32-bit int and 32-bit float

lx = np.arange(10, dtype=np.int32)
print(lx.dtype)
lx = lx.astype(np.float32)
print(lx.dtype)

18-Print the index and value of each array element

dy = np.arange(12).reshape(3, 4)
for i, val in np.ndenumerate(dy):
    print(i, val)

19-Sort an array by one of its columns

px = np.random.randint(0, 10, (3, 3))
print(px)
print(px[px[:, 0].argsort()])
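
The trick is that argsort on a single column yields a row order, which is then used to index the whole array. A sketch on the fixed matrix from the sample output, sorting by the first and then by the second column (variable names are illustrative):

```python
import numpy as np

px = np.array([[6, 0, 7],
               [2, 3, 5],
               [4, 2, 4]])

# Row order that sorts column 0, applied to all rows.
by_col0 = px[px[:, 0].argsort()]
# The same idea for column 1.
by_col1 = px[px[:, 1].argsort()]

print(by_col0)
print(by_col1)
```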

20-Count how many times each value occurs in an array

cs = np.array([3, 5, 23, 5, 2, 5, 6, 7, 2, 3, 5])
print(np.bincount(cs))
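
np.bincount only accepts non-negative integers and returns a count for every integer from 0 up to the maximum, so most slots are zero here. If only the values that actually occur are of interest, np.unique with `return_counts=True` is one alternative, sketched below:

```python
import numpy as np

cs = np.array([3, 5, 23, 5, 2, 5, 6, 7, 2, 3, 5])

# values: sorted distinct entries; counts: how often each one occurs.
values, counts = np.unique(cs, return_counts=True)
for v, c in zip(values.tolist(), counts.tolist()):
    print(v, c)

# The two approaches agree on the values that are present.
print(np.array_equal(np.bincount(cs)[values], counts))
```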

Output of the code above:
52.69503887473037
53
int32
float32
(0, 0) 0
(0, 1) 1
(0, 2) 2
(0, 3) 3
(1, 0) 4
(1, 1) 5
(1, 2) 6
(1, 3) 7
(2, 0) 8
(2, 1) 9
(2, 2) 10
(2, 3) 11
[[6 0 7]
 [2 3 5]
 [4 2 4]]
[[2 3 5]
 [4 2 4]
 [6 0 7]]
[0 0 2 ... 0 0 1]



21-Sum over the last two axes of a four-dimensional array

szzz = np.random.randint(0, 10, [4, 4, 4, 4])
qh = szzz.sum(axis=(-2, -1))
print(qh)
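
Passing a tuple to `axis` collapses both axes in one call. An equivalent formulation, shown only as a cross-check, flattens the last two axes into one and sums over that; `s1` and `s2` are illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=(4, 4, 4, 4))

# Sum over the last two axes directly.
s1 = a.sum(axis=(-2, -1))

# Equivalent: merge the last two axes into one, then sum over it.
s2 = a.reshape(a.shape[:-2] + (-1,)).sum(axis=-1)

print(s1.shape)
print(np.array_equal(s1, s2))
```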

22-Swap two rows of a matrix

sz = np.arange(16).reshape(4, 4)
sz[[0, 1]] = sz[[1, 0]]
print(sz)

23-Find the most frequent value in an array

sz = np.random.randint(0, 20, 20)
print(np.bincount(sz).argmax())

24-Quickly find the top K values

sz = np.arange(1000)
np.random.shuffle(sz)
x = 66
print(sz[np.argpartition(-sz, x)[:x]])
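
np.argpartition guarantees only that the K largest values land in the first K slots; their order within those slots is not sorted. If a ranked list is needed, sorting just those K values is cheap. A sketch with a small K and a seeded shuffle (both chosen here for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(0)
sz = rng.permutation(1000)
k = 5

# Indices of the k largest values, in no particular order.
top_idx = np.argpartition(-sz, k)[:k]
top = sz[top_idx]

# Sort only those k values to get them ranked from largest to smallest.
top_sorted = np.sort(top)[::-1]
print(top_sorted)
```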

25-Remove the rows of an array whose elements are all identical

np.set_printoptions(threshold=6)
sz = np.random.randint(0, 5, (10, 3))
print(sz)

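# True for rows in which every element equals its neighbor, i.e. all elements are identical
sj = np.all(sz[:, 1:] == sz[:, :-1], axis=1)
print(sj)

# True for rows in which at least one pair of adjacent elements is equal
sj2 = np.any(sz[:, 1:] == sz[:, :-1], axis=1)
print(sj2)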
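
The code above only computes the boolean masks. To actually drop the rows whose elements are all identical, the first mask can be inverted and used as a row filter. A sketch on a small fixed array (the data and the names `all_same` and `kept` are made up for illustration):

```python
import numpy as np

sz = np.array([[1, 1, 1],
               [0, 2, 2],
               [3, 3, 3],
               [4, 0, 1]])

# True where every element of the row equals its right-hand neighbor.
all_same = np.all(sz[:, 1:] == sz[:, :-1], axis=1)

# Keep only the rows that are not constant.
kept = sz[~all_same]
print(all_same)
print(kept)
```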

Output of the code above:
[[81 81 71 54]
 [78 60 38 63]
 [63 81 74 80]
 [67 58 69 76]]
[[ 4  5  6  7]
 [ 0  1  2  3]
 [ 8  9 10 11]

