import numpy as np
1. Create a zero vector of length 10
a=np.zeros(10)
print(a)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
2. Set the fifth value to 1
a[4]=1
print(a)
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
3. Create a vector with values ranging from 10 to 49
b=np.arange(10,50)
print(b)
[10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49]
4. Reverse a vector (the first element becomes the last)
b=b[::-1]
print(b)
[49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10]
5. Create a random vector of length 30 and find its mean
c=np.random.random(30)
d=c.mean()
print(d)
0.5527580659886581
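The mean printed above changes on every run because the samples are random. A minimal sketch of a reproducible variant, assuming NumPy's `Generator` API is available (NumPy 1.17 or later); the seed value 0 is an arbitrary choice for illustration:

```python
import numpy as np

# Seed 0 is arbitrary; any fixed integer makes the run repeatable.
rng = np.random.default_rng(0)
c = rng.random(30)   # 30 samples drawn uniformly from [0, 1)
d = c.mean()

# Re-creating the generator with the same seed reproduces the same mean.
d_again = np.random.default_rng(0).random(30).mean()
print(d == d_again)
```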
6. Create a 10x10 random array and find its maximum and minimum values
e=np.random.random((10,10))
print(e)
[[0.9170069 0.66568986 0.56742636 0.14696659 0.93767453 0.09402097
0.54618329 0.12625947 0.90182726 0.61660999]
[0.3486533 0.98538757 0.68982081 0.33552572 0.24356616 0.1594021
0.40305814 0.0639056 0.67879594 0.57891074]
[0.60909009 0.99302626 0.54645758 0.73081934 0.45984285 0.82253375
0.35515268 0.57644458 0.2955276 0.18391424]
[0.24158195 0.49485468 0.59128442 0.55743579 0.99961246 0.4508759
0.74789393 0.02750277 0.98453766 0.01303206]
[0.0033238 0.88869616 0.14164172 0.85822794 0.58211973 0.56087155
0.77262901 0.24472144 0.96130194 0.62989183]
[0.07068191 0.40728846 0.47576458 0.44355222 0.10916397 0.64966196
0.49579891 0.17922588 0.87968783 0.69344045]
[0.23523426 0.02701388 0.22272702 0.72441141 0.84543185 0.57887048
0.58257562 0.24566879 0.27529409 0.99546806]
[0.81835016 0.53123186 0.0743116 0.23961043 0.95036313 0.42512372
0.12252098 0.10007916 0.3101578 0.14821808]
[0.03880317 0.07812695 0.835528 0.68364065 0.34818947 0.13560458
0.60249626 0.25700171 0.70802306 0.73026684]
[0.27389633 0.17089138 0.12786965 0.78076045 0.99042567 0.98327677
0.87483383 0.55379135 0.56356262 0.14683745]]
# Use distinct names so the built-in min() and max() are not shadowed
e_min,e_max=e.min(),e.max()
print(e_min,e_max)
0.0033238021751439417 0.999612459219021
7. Normalize a 5x5 random matrix
f=np.random.random((5,5))
# Min-max scaling: the smallest entry maps to 0 and the largest to 1
f_max,f_min=f.max(),f.min()
f=(f-f_min)/(f_max-f_min)
print(f)
[[0.79651043 0.77650729 0.68965856 0.0518204 0.81799265]
[0.9568898 0.77565194 0.44387402 0.71809914 0.70544833]
[0. 0.63012163 0.28662836 0.7276584 0.91605213]
[0.65708275 0.86770802 0.53851921 1. 0.30971719]
[0.49909808 0.25904126 0.20944617 0.28601671 0.99925076]]
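The rescaling above divides by the range of the matrix, which is zero when every entry is equal. A minimal sketch of a guarded version follows; the helper name `min_max_normalize` and the choice to return zeros for constant input are assumptions made for illustration, not part of the exercise:

```python
import numpy as np

def min_max_normalize(x):
    """Rescale x linearly so its minimum maps to 0 and its maximum to 1."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant input: the range is zero, so return zeros
        # instead of dividing by zero (an illustrative convention).
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)

f = min_max_normalize([[1.0, 2.0], [3.0, 5.0]])
print(f)
```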
8. How to get all the dates corresponding to July 2016
g=np.arange('2016-07','2016-08',dtype='datetime64[D]')
print(g)
['2016-07-01' '2016-07-02' '2016-07-03' '2016-07-04' '2016-07-05'
'2016-07-06' '2016-07-07' '2016-07-08' '2016-07-09' '2016-07-10'
'2016-07-11' '2016-07-12' '2016-07-13' '2016-07-14' '2016-07-15'
'2016-07-16' '2016-07-17' '2016-07-18' '2016-07-19' '2016-07-20'
'2016-07-21' '2016-07-22' '2016-07-23' '2016-07-24' '2016-07-25'
'2016-07-26' '2016-07-27' '2016-07-28' '2016-07-29' '2016-07-30'
'2016-07-31']
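The same idea extends to any month by stepping a month-resolution `datetime64` forward by one month. A rough sketch, where the helper name `dates_in_month` is hypothetical:

```python
import numpy as np

def dates_in_month(month):
    """Return every date of a month given as a 'YYYY-MM' string."""
    start = np.datetime64(month, 'M')                  # month resolution
    first_day = start.astype('datetime64[D]')          # first day of the month
    next_first = (start + np.timedelta64(1, 'M')).astype('datetime64[D]')
    # The stop value is excluded, so the last element is the month's final day.
    return np.arange(first_day, next_first, np.timedelta64(1, 'D'))

g = dates_in_month('2016-07')
print(g.size, g[0], g[-1])
```

Because the end point is computed from the calendar, leap years need no special handling.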
9. Check whether a 2D array has null (all-zero) columns
h=np.random.randint(0,10,(3,10))
print(h)
[[5 2 3 1 4 3 7 0 0 4]
[1 2 0 9 4 7 4 6 3 4]
[3 5 0 1 7 1 4 1 7 4]]
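# any(axis=0) is True for each column holding a nonzero value;
# a null column is one where it is False
print((~h.any(axis=0)).any())
False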
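A per-column check depends on the `axis` argument: `any(axis=0)` reduces down each column, while `any()` with no axis looks at the whole array at once. A minimal sketch on a small hand-written array (the values are illustrative, not taken from the exercise) that also reports which columns are null:

```python
import numpy as np

h_demo = np.array([[1, 0, 3],
                   [4, 0, 6]])

nonzero_per_column = h_demo.any(axis=0)         # True where a column has a nonzero entry
has_null_column = (~nonzero_per_column).any()   # True if at least one column is all zeros
null_column_indices = np.flatnonzero(~nonzero_per_column)

print(has_null_column, null_column_indices)
```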
10. Create an array of a given shape whose elements follow the standard normal distribution N(0,1)
i=np.random.randn(4,6)
print(i)
[[-0.06696173 -0.397078 -0.54527181 1.40846477 0.38308806 1.39167608]
[ 0.43594496 -0.90587903 0.95744047 0.97402486 -1.98892672 -0.32113629]
[-0.80864666 -1.42289961 0.11787861 -1.39319269 0.0828501 0.04520828]
[ 0.99880749 -1.97154802 -1.24288586 0.22087789 0.12689615 0.33137178]]
11. Create an array of a given shape filled with random samples uniformly distributed over [0, 1)
j=np.random.rand(4,6)
print(j)
[[0.91697975 0.65660303 0.96028603 0.9025543 0.13344065 0.24094464]
[0.09234047 0.84434315 0.25852919 0.04614204 0.52459626 0.16026817]
[0.26738904 0.15655722 0.89825181 0.51723048 0.72831946 0.02677522]
[0.71228968 0.19691786 0.19664906 0.66680613 0.44638245 0.5629584 ]]
12. Generate integers from a discrete uniform distribution over the half-open interval [low, high)
k=np.random.randint(0,5,(4,6))
print(k)
[[0 2 0 4 4 2]
[1 0 0 1 3 0]
[3 1 2 4 2 4]
[4 1 3 3 3 2]]
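The three sampling calls in exercises 10 to 12 use NumPy's legacy module-level functions. As a sketch, the same draws can be written with the newer `Generator` API (assuming NumPy 1.17 or later); the seed 0 is an arbitrary choice for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)           # seed chosen arbitrarily

i = rng.standard_normal((4, 6))          # standard normal N(0, 1), like np.random.randn(4, 6)
j = rng.random((4, 6))                   # uniform on [0, 1), like np.random.rand(4, 6)
k = rng.integers(0, 5, size=(4, 6))      # integers in [0, 5), like np.random.randint(0, 5, (4, 6))

print(i.shape, j.shape, k.shape)
```

Note that the `Generator` methods take the shape as a single tuple, whereas `randn` and `rand` take each dimension as a separate argument.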
import pandas as pd
13. Create a Series from a list
l=[0,1,2,3,4]
df=pd.Series(l)
df
0 0
1 1
2 2
3 3
4 4
dtype: int64
14. Create a Series from a dictionary
m={'a':1,'b':2,'c':3,'d':4,'e':5}
df=pd.Series(m)
df
a 1
b 2
c 3
d 4
e 5
dtype: int64
15. Create a DataFrame from a NumPy array
dates=pd.date_range('today',periods=6)
num_arr=np.random.rand(6,4)
columns=['A','B','C','D']
df=pd.DataFrame(num_arr,index=dates,columns=columns)
df
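Since `'today'` and the random values differ on every run, no fixed output can be shown for this exercise. A minimal sketch of a repeatable variant, assuming a fixed start date and seed (both arbitrary choices for illustration):

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2016-07-01', periods=6)   # fixed start date instead of 'today'
rng = np.random.default_rng(0)                   # arbitrary seed
num_arr = rng.random((6, 4))
df_demo = pd.DataFrame(num_arr, index=dates, columns=['A', 'B', 'C', 'D'])

print(df_demo.shape)
```

The array must have one row per index label and one column per column name, otherwise the constructor raises a `ValueError`.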
16. Create a DataFrame from a dictionary and set the index
data={'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(data, index=labels)
df
17. Display basic information about df
(1)df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 10 entries, a to j
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 animal 10 non-null object
1 age 8 non-null float64
2 visits 10 non-null int64
3 priority 10 non-null object
dtypes: float64(1), int64(1), object(2)
memory usage: 400.0+ bytes
(2)df.describe()
18. Show the first 3 rows of df
(1)df.head(3)
(2)df.iloc[:3]
19. Select the animal and age columns of df
(1)df[['animal','age']]
(2)df.loc[:,['animal','age']]
20. Select the animal and age columns for the rows at positions [3,4,8]
df.loc[df.index[[3,4,8]],['animal','age']]
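The expression above mixes positional rows with labelled columns by first translating positions into labels. A sketch of the purely positional alternative with `iloc`, shown on a small hypothetical frame (its values are illustrative, not the exercise data):

```python
import pandas as pd

df_demo = pd.DataFrame(
    {'animal': ['cat', 'dog', 'snake', 'cat', 'dog'],
     'age': [2.5, 3.0, 0.5, 5.0, 7.0],
     'visits': [1, 3, 2, 3, 2]},
    index=list('abcde'))

# Label-based: convert row positions to labels, then use loc.
by_label = df_demo.loc[df_demo.index[[1, 3]], ['animal', 'age']]

# Position-based: convert column labels to positions, then use iloc.
col_positions = df_demo.columns.get_indexer(['animal', 'age'])
by_position = df_demo.iloc[[1, 3], col_positions]

print(by_label.equals(by_position))
```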
21. Select the rows where age is greater than 3
df[df['age']>3]
22. Select the rows where age is missing
df[df['age'].isnull()]
23. Select the rows where age is between 2 and 4 (exclusive)
df[(df['age']>2)&(df['age']<4)]
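`Series.between` includes both endpoints by default, so an exclusive range needs strict comparisons. A minimal sketch on a hand-written Series (illustrative values) that shows the difference:

```python
import numpy as np
import pandas as pd

age = pd.Series([2.0, 3.0, 4.0, np.nan])

inclusive = age.between(2, 4)         # default: both endpoints count as inside
exclusive = (age > 2) & (age < 4)     # strict comparisons leave the endpoints out

print(inclusive.tolist())  # [True, True, True, False]
print(exclusive.tolist())  # [False, True, False, False]
```

Missing values compare as False in both forms, so rows with a missing age are never selected.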
24. Change the age in row f to 1.5
df.loc['f','age']=1.5
25. Compute the sum of visits
df['visits'].sum()
19
26. Compute the mean age for each kind of animal
df.groupby('animal')['age'].mean()
animal
cat 2.333333
dog 5.000000
snake 2.500000
Name: age, dtype: float64
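The group means skip missing ages rather than treating them as zero, which is why the cat mean above is taken over three values instead of four. A small sketch on a hypothetical frame (illustrative values) that makes this visible by pairing the mean with the non-missing count:

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({'animal': ['cat', 'cat', 'dog', 'dog'],
                        'age': [2.0, np.nan, 5.0, 7.0]})

grouped = df_demo.groupby('animal')['age']
means = grouped.mean()     # NaN is ignored: cat uses only the single known age
counts = grouped.count()   # number of non-missing ages per group

print(means)
print(counts)
```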
27. Count the number of each kind of animal in df
df['animal'].value_counts()
dog 4
cat 4
snake 2
Name: animal, dtype: int64
28. Sort by age in descending order, then by visits in ascending order
df.sort_values(by=['age','visits'],ascending=[False,True])
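With several sort keys, the later key only breaks ties in the earlier one, and rows with a missing age are placed last by default regardless of direction. A minimal sketch on a small hypothetical frame (illustrative values):

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({'age': [3.0, np.nan, 7.0, 3.0],
                        'visits': [3, 1, 2, 1]},
                       index=list('abcd'))

out = df_demo.sort_values(by=['age', 'visits'], ascending=[False, True])
print(out.index.tolist())  # ['c', 'd', 'a', 'b']
```

Rows `a` and `d` tie on age, so `visits` decides their order; row `b` has no age and ends up at the bottom. Passing `na_position='first'` would move it to the top instead.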