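Author nickname: lxw-pro
Personal motto: "Failure is the mother of success." That never changes: learn from each failure and grow through it, and you can become a master of the IT field.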
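import numpy as np  # import the NumPy library
Basic arithmetic:
x = np.array([3, 5])
y = np.array([6, 2])
# Element-wise multiplication
xc = np.multiply(x, y)
print(xc)
# Multiply element-wise, then sum (dot product)
qxc = np.dot(x, y)
print(qxc)
print(x.shape)
print(y.shape)
# Multiply a 1-D array by a 2-D array (x is broadcast across each row of y)
x = np.array([2, 3, 4])
y = np.array([
    [1, 2, 3],
    [2, 3, 4]
])
print(x * y)
# Check element-wise whether x and y2 are equal
y2 = np.array([2, 4, 9])
print(x == y2)
# Logical AND
yy = np.logical_and(x, y2)
print(yy)
# Logical OR
hh = np.logical_or(x, y2)
print(hh)
# Logical NOT
# Note: np.logical_not takes a single input. The second positional argument is
# the output buffer, so this call overwrites y2 with the result and prints
# integer zeros. np.logical_not(x) on its own returns a boolean array.
ff = np.logical_not(x, y2)
print(ff)
Program output: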
[18 10]
28
(2,)
(2,)
[[ 2 6 12]
[ 4 9 16]]
[ True False False]
[ True True True]
[ True True True]
[0 0 0]
print()
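As a hedged illustration of the operations above, the sketch below reuses the same arrays to show two things: `x * y` broadcasts the 1-D array across each row of the 2-D array, and `np.logical_not` is a unary function whose second positional argument is an output buffer. The names `product`, `not_x`, `buf`, and `result` are introduced here for illustration only.

```python
import numpy as np

x = np.array([2, 3, 4])
y = np.array([
    [1, 2, 3],
    [2, 3, 4],
])

# Broadcasting: x (shape (3,)) is stretched to match y (shape (2, 3)),
# so each row of y is multiplied element-wise by x.
product = x * y
print(product)

# np.logical_not takes one input; non-zero values count as True.
not_x = np.logical_not(x)
print(not_x)

# A second positional argument is treated as the output buffer.
# The buffer is overwritten with the result, cast to the buffer's dtype.
buf = np.array([2, 4, 9])
result = np.logical_not(x, buf)
print(result)
print(result is buf)  # the returned array is the buffer itself
```

This is why the last line of the output above shows integer zeros rather than booleans.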
Random module:
sj = np.random.rand(2, 6)  # all values lie in [0, 1)
print(sj)
yx = np.random.randint(8, size=(5, 3))  # random integers from the half-open range [0, 8)
print(yx)
# A single random number
s = np.random.rand()
print(s)
# A random sample
yb = np.random.random_sample()
print(yb)
# Random integers within a range
qjs = np.random.randint(0, 10, 6)
print(qjs)
# Gaussian (normal) distribution
mu, sigma = 0, 0.1
fb = np.random.normal(mu, sigma, 8)
print(fb)
# Set the print precision
np.set_printoptions(precision=3)  # returns None, so there is nothing to assign
print(fb)
# Shuffle
xps = np.arange(10)
np.random.shuffle(xps)
print(xps)
# Random seed
np.random.seed(100)
mu, sigma = 0, 0.1
z = np.random.normal(mu, sigma, 8)
print(z)
Program output:
[[0.63334441 0.85097104 0.59019264 0.310542 0.90493224 0.64755 ]
[0.26229661 0.22710308 0.8936011 0.42837496 0.06484865 0.01209753]]
[[3 5 4]
[6 4 0]
[5 3 5]
[4 2 7]
[2 0 3]]
0.5814122350900927
0.37162507133518075
[1 0 1 4 6 2]
[ 0.04351687 -0.02026214 0.02332794 -0.09842403 0.06876269 0.02239188
-0.06339656 0.11343825]
[ 0.044 -0.02 0.023 -0.098 0.069 0.022 -0.063 0.113]
[6 2 4 3 7 0 1 5 8 9]
[-0.175 0.034 0.115 -0.025 0.098 0.051 0.022 -0.107]
print()
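The calls above rely on NumPy's legacy global random state. As a hedged alternative sketch, the newer `np.random.default_rng` Generator API keeps the seed local to a generator object, which makes reproducibility explicit. The seed value 100 mirrors the example above and the variable names are illustrative; the numbers drawn will not match the legacy `np.random.seed(100)` stream.

```python
import numpy as np

# Two generators built from the same seed yield the same sequence.
rng_a = np.random.default_rng(100)
rng_b = np.random.default_rng(100)

mu, sigma = 0, 0.1
draw_a = rng_a.normal(mu, sigma, 8)
draw_b = rng_b.normal(mu, sigma, 8)
print(np.array_equal(draw_a, draw_b))

# Integers in the half-open range [0, 10), like np.random.randint(0, 10, 6).
ints = rng_a.integers(0, 10, size=6)
print(ints)

# Uniform floats in [0, 1), like np.random.rand(2, 6).
uniform = rng_a.random((2, 6))
print(uniform.shape)

# permutation returns a shuffled copy and leaves the input untouched,
# whereas shuffle reorders the array in place.
deck = np.arange(10)
shuffled = rng_a.permutation(deck)
print(shuffled)
```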
File reading and writing:
data = []
with open('np2.txt') as f:
    for line in f:
        fil = line.split()
        f_data = [float(i) for i in fil]
        data.append(f_data)
data = np.array(data)
print(data)
# Method 2: simpler
# delimiter: field separator | skiprows=1: number of leading rows to skip | usecols=(0, 1, 4): which columns to read
data = np.loadtxt('np2.txt', delimiter=' ', skiprows=1)
print(data)
Program output:
[[1. 2. 3. 4. 5. 6.]
[4. 5. 6. 7. 8. 9.]]
[4. 5. 6. 7. 8. 9.]
print()
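To make the `np.loadtxt` parameters concrete without depending on `np2.txt`, here is a small sketch that reads from an in-memory text buffer instead. The two data rows are an assumption inferred from the output above; the actual file is not shown in the post.

```python
import io
import numpy as np

# Assumed contents of np2.txt, inferred from the printed output.
text = "1 2 3 4 5 6\n4 5 6 7 8 9\n"

# With no options, every row is parsed as floats.
all_rows = np.loadtxt(io.StringIO(text))
print(all_rows)

# skiprows=1 drops the first row; with one row left the result is 1-D.
skipped = np.loadtxt(io.StringIO(text), delimiter=' ', skiprows=1)
print(skipped)

# usecols=(0, 1, 4) keeps only the first, second and fifth columns.
subset = np.loadtxt(io.StringIO(text), usecols=(0, 1, 4))
print(subset)
```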
Saving arrays:
xr = np.array([
    [1, 2, 3],
    [6, 7, 8]
])
np.savetxt('np2_1.txt', xr)
np.savetxt('np2_2.txt', xr, fmt='%d')
np.savetxt('np2_3.txt', xr, fmt='%d', delimiter=',')
np.savetxt('np2_4.txt', xr, fmt='%.2f', delimiter=' ')
# Save and load the array in NumPy's binary .npy format
dx_array = np.array([
    [5, 2, 0],
    [1, 4, 9]
])
np.save('np2_1.npy', dx_array)
dx = np.load('np2_1.npy')
print(dx)
Program output:
[[5 2 0]
[1 4 9]]
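The difference between the text and binary formats is easier to see in a round trip. The following sketch writes to a temporary directory (an assumption made so it leaves no files behind; the file names are illustrative) and reads the data back both ways.

```python
import os
import tempfile
import numpy as np

xr = np.array([
    [1, 2, 3],
    [6, 7, 8],
])

with tempfile.TemporaryDirectory() as tmp:
    # Text format: integers separated by commas.
    txt_path = os.path.join(tmp, "xr.csv")
    np.savetxt(txt_path, xr, fmt="%d", delimiter=",")
    with open(txt_path) as f:
        txt_content = f.read()
    # loadtxt returns floats by default; dtype=int restores integers.
    from_txt = np.loadtxt(txt_path, delimiter=",", dtype=int)

    # Binary .npy format: dtype and shape are stored with the data.
    npy_path = os.path.join(tmp, "xr.npy")
    np.save(npy_path, xr)
    from_npy = np.load(npy_path)

print(txt_content)
print(from_txt)
print(from_npy.dtype == xr.dtype)
```

Text files are human-readable but lose the dtype, while `.npy` files preserve it without any extra arguments.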
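NumPy exercises:
import numpy as np  # import the NumPy library
1 - Print the current NumPy version
print(np.__version__)
2 - Create an all-zeros matrix and print how much memory it occupies
ojz = np.zeros((5, 5))
print(ojz)
print("%d bytes" % (ojz.size*ojz.itemsize))
3 - Print the help documentation of a function, for example numpy.add
# Note: np.info(np.add) prints the documentation itself and returns None, so the
# outer help() call then shows the help page for None, and bz ends up as None.
# Calling np.info(np.add) on its own is enough.
bz = help(np.info(np.add))
print(bz)
4 - Create an array from 2 to 20 and reverse it
sz = np.arange(2, 21, 1)
print(sz)
sz = sz[::-1]
print(sz)
5 - Find the indices of the non-zero elements in an array
sy = np.nonzero([2, 53, 12, 43, 0, 0, 0, 23, 90])
print(sy)
Program output: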
1.22.3
[[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
200 bytes
add(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])
Add arguments element-wise.
Parameters
----------
x1, x2 : array_like
The arrays to be added.
If ``x1.shape != x2.shape``, they must be broadcastable to a common
shape (which becomes the shape of the output).
out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If provided, it must have
a shape that the inputs broadcast to. If not provided or None,
a freshly-allocated array is returned. A tuple (possible only as a
keyword argument) must have length equal to the number of outputs.
where : array_like, optional
This condition is broadcast over the input. At locations where the
condition is True, the `out` array will be set to the ufunc result.
Elsewhere, the `out` array will retain its original value.
Note that if an uninitialized `out` array is created via the default
``out=None``, locations within it where the condition is False will
remain uninitialized.
**kwargs
For other keyword-only arguments, see the
:ref:`ufunc docs <ufuncs.kwargs>`.
Returns
-------
add : ndarray or scalar
The sum of `x1` and `x2`, element-wise.
This is a scalar if both `x1` and `x2` are scalars.
Notes
-----
Equivalent to `x1` + `x2` in terms of array broadcasting.
Examples
--------
>>> np.add(1.0, 4.0)
5.0
>>> x1 = np.arange(9.0).reshape((3, 3))
>>> x2 = np.arange(3.0)
>>> np.add(x1, x2)
array([[ 0., 2., 4.],
[ 3., 5., 7.],
[ 6., 8., 10.]])
The ``+`` operator can be used as a shorthand for ``np.add`` on ndarrays.
>>> x1 = np.arange(9.0).reshape((3, 3))
>>> x2 = np.arange(3.0)
>>> x1 + x2
array([[ 0., 2., 4.],
[ 3., 5., 7.],
[ 6., 8., 10.]])
Help on NoneType object:
class NoneType(object)
| Methods defined here:
|
| __bool__(self, /)
| self != 0
|
| __repr__(self, /)
| Return repr(self).
|
| ----------------------------------------------------------------------
| Static methods defined here:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
None
[ 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
[20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2]
(array([0, 1, 2, 3, 7, 8], dtype=int32),)
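A few details of exercises 2, 4 and 5 can be checked directly. This sketch is an illustration under the same inputs as above; the names `rev`, `values`, and `idx` are introduced here and are not part of the original exercises.

```python
import numpy as np

ojz = np.zeros((5, 5))
# nbytes equals size * itemsize: 25 float64 elements of 8 bytes each.
print(ojz.nbytes)

sz = np.arange(2, 21)
# Slicing with a step of -1 returns a reversed view, not a copy.
rev = sz[::-1]
print(rev[0], rev[-1])
print(np.shares_memory(rev, sz))

values = np.array([2, 53, 12, 43, 0, 0, 0, 23, 90])
# np.nonzero returns a tuple with one index array per dimension,
# so [0] selects the indices along the only axis of a 1-D array.
idx = np.nonzero(values)[0]
print(idx)
# The indices can be used directly to pull out the non-zero values.
print(values[idx])
```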
6 - Create a random 3x3 matrix and print its maximum and minimum values
zz = np.random.random((3, 3))
print(zz.max())
print(zz.min())
7 - Create a 5x5 matrix of ones and surround it with a border of zeros
jz = np.ones((5, 5))
jz = np.pad(jz, pad_width=1, mode='constant', constant_values=0)
print(jz)
help(np.pad)  # print the help documentation (help() prints it and returns None)
8 - For a matrix of shape (6, 7, 8), find the multi-dimensional index of the element at flat index 100
sy8 = np.unravel_index(100, (6, 7, 8))
print(sy8)
9 - Apply min-max normalization to a 5x5 matrix
cz = np.random.random((5, 5))
cz_max = cz.max()
cz_min = cz.min()
cz = (cz-cz_min)/(cz_max-cz_min)
print(cz)
10 - Find the values that two arrays have in common
sz1 = np.random.randint(0, 20, 8)
sz2 = np.random.randint(0, 20, 8)
print(sz1)
print(sz2)
print(np.intersect1d(sz1, sz2))
Program output:
0.9786237847073697
0.10837689046425514
[[0. 0. 0. 0. 0. 0. 0.]
[0. 1. 1. 1. 1. 1. 0.]
[0. 1. 1. 1. 1. 1. 0.]
[0. 1. 1. 1. 1. 1. 0.]
[0. 1. 1. 1. 1. 1. 0.]
[0. 1. 1. 1. 1. 1. 0.]
[0. 0. 0. 0. 0. 0. 0.]]
Help on function pad in module numpy:
pad(array, pad_width, mode='constant', **kwargs)
Pad an array.
Parameters
----------
array : array_like of rank N
The array to pad.
pad_width : {sequence, array_like, int}
Number of values padded to the edges of each axis.
((before_1, after_1), ... (before_N, after_N)) unique pad widths
for each axis.
((before, after),) yields same before and after pad for each axis.
(pad,) or int is a shortcut for before = after = pad width for all
axes.
mode : str or function, optional
One of the following string values or a user supplied function.
'constant' (default)
Pads with a constant value.
'edge'
Pads with the edge values of array.
'linear_ramp'
Pads with the linear ramp between end_value and the
array edge value.
'maximum'
Pads with the maximum value of all or part of the
vector along each axis.
'mean'
Pads with the mean value of all or part of the
vector along each axis.
'median'
Pads with the median value of all or part of the
vector along each axis.
'minimum'
Pads with the minimum value of all or part of the
vector along each axis.
'reflect'
Pads with the reflection of the vector mirrored on
the first and last values of the vector along each
axis.
'symmetric'
Pads with the reflection of the vector mirrored
along the edge of the array.
'wrap'
Pads with the wrap of the vector along the axis.
The first values are used to pad the end and the
end values are used to pad the beginning.
'empty'
Pads with undefined values.
.. versionadded:: 1.17
<function>
Padding function, see Notes.
stat_length : sequence or int, optional
Used in 'maximum', 'mean', 'median', and 'minimum'. Number of
values at edge of each axis used to calculate the statistic value.
((before_1, after_1), ... (before_N, after_N)) unique statistic
lengths for each axis.
((before, after),) yields same before and after statistic lengths
for each axis.
(stat_length,) or int is a shortcut for before = after = statistic
length for all axes.
Default is ``None``, to use the entire axis.
constant_values : sequence or scalar, optional
Used in 'constant'. The values to set the padded values for each
axis.
``((before_1, after_1), ... (before_N, after_N))`` unique pad constants
for each axis.
``((before, after),)`` yields same before and after constants for each
axis.
``(constant,)`` or ``constant`` is a shortcut for ``before = after = constant`` for
all axes.
Default is 0.
end_values : sequence or scalar, optional
Used in 'linear_ramp'. The values used for the ending value of the
linear_ramp and that will form the edge of the padded array.
``((before_1, after_1), ... (before_N, after_N))`` unique end values
for each axis.
``((before, after),)`` yields same before and after end values for each
axis.
``(constant,)`` or ``constant`` is a shortcut for ``before = after = constant`` for
all axes.
Default is 0.
reflect_type : {'even', 'odd'}, optional
Used in 'reflect', and 'symmetric'. The 'even' style is the
default with an unaltered reflection around the edge value. For
the 'odd' style, the extended part of the array is created by
subtracting the reflected values from two times the edge value.
Returns
-------
pad : ndarray
Padded array of rank equal to `array` with shape increased
according to `pad_width`.
Notes
-----
.. versionadded:: 1.7.0
For an array with rank greater than 1, some of the padding of later
axes is calculated from padding of previous axes. This is easiest to
think about with a rank 2 array where the corners of the padded array
are calculated by using padded values from the first axis.
The padding function, if used, should modify a rank 1 array in-place. It
has the following signature::
padding_func(vector, iaxis_pad_width, iaxis, kwargs)
where
vector : ndarray
A rank 1 array already padded with zeros. Padded values are
vector[:iaxis_pad_width[0]] and vector[-iaxis_pad_width[1]:].
iaxis_pad_width : tuple
A 2-tuple of ints, iaxis_pad_width[0] represents the number of
values padded at the beginning of vector where
iaxis_pad_width[1] represents the number of values padded at
the end of vector.
iaxis : int
The axis currently being calculated.
kwargs : dict
Any keyword arguments the function requires.
Examples
--------
>>> a = [1, 2, 3, 4, 5]
>>> np.pad(a, (2, 3), 'constant', constant_values=(4, 6))
array([4, 4, 1, ..., 6, 6, 6])
>>> np.pad(a, (2, 3), 'edge')
array([1, 1, 1, ..., 5, 5, 5])
>>> np.pad(a, (2, 3), 'linear_ramp', end_values=(5, -4))
array([ 5, 3, 1, 2, 3, 4, 5, 2, -1, -4])
>>> np.pad(a, (2,), 'maximum')
array([5, 5, 1, 2, 3, 4, 5, 5, 5])
>>> np.pad(a, (2,), 'mean')
array([3, 3, 1, 2, 3, 4, 5, 3, 3])
>>> np.pad(a, (2,), 'median')
array([3, 3, 1, 2, 3, 4, 5, 3, 3])
>>> a = [[1, 2], [3, 4]]
>>> np.pad(a, ((3, 2), (2, 3)), 'minimum')
array([[1, 1, 1, 2, 1, 1, 1],
[1, 1, 1, 2, 1, 1, 1],
[1, 1, 1, 2, 1, 1, 1],
[1, 1, 1, 2, 1, 1, 1],
[3, 3, 3, 4, 3, 3, 3],
[1, 1, 1, 2, 1, 1, 1],
[1, 1, 1, 2, 1, 1, 1]])
>>> a = [1, 2, 3, 4, 5]
>>> np.pad(a, (2, 3), 'reflect')
array([3, 2, 1, 2, 3, 4, 5, 4, 3, 2])
>>> np.pad(a, (2, 3), 'reflect', reflect_type='odd')
array([-1, 0, 1, 2, 3, 4, 5, 6, 7, 8])
>>> np.pad(a, (2, 3), 'symmetric')
array([2, 1, 1, 2, 3, 4, 5, 5, 4, 3])
>>> np.pad(a, (2, 3), 'symmetric', reflect_type='odd')
array([0, 1, 1, 2, 3, 4, 5, 5, 6, 7])
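The output listing does not show results for exercises 8 to 10, so here is a hedged sketch of what they compute. Small fixed arrays stand in for the random inputs (an assumption made so the results are predictable), and `np.ravel_multi_index` is added only to show the inverse of `np.unravel_index`.

```python
import numpy as np

shape = (6, 7, 8)
# Flat index 100 (counting from 0) maps to a 3-D index in row-major order:
# 100 = 1*(7*8) + 5*8 + 4.
idx = np.unravel_index(100, shape)
print(idx)
# np.ravel_multi_index is the inverse mapping.
flat = np.ravel_multi_index(idx, shape)
print(flat)

# Min-max normalization maps the smallest value to 0 and the largest to 1.
cz = np.array([[1.0, 3.0], [5.0, 9.0]])
norm = (cz - cz.min()) / (cz.max() - cz.min())
print(norm)

# intersect1d returns the sorted, unique values present in both arrays.
sz1 = np.array([3, 7, 7, 12, 19])
sz2 = np.array([19, 3, 4, 4, 8])
common = np.intersect1d(sz1, sz2)
print(common)
```

Note that min-max normalization divides by `max - min`, so it is undefined for an array whose values are all equal.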