NumPy basics

Latest recommended article published on 2023-08-16 08:16:11

撕破伤口

Tags: python, numpy, programming language

Article link: https://blog.csdn.net/m0_49643291/article/details/127046748

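NumPy:

Methods:

np.array(): converts a list to an ndarray

np.asarray(): converts a list to an ndarray

np.ones(): creates an all-ones array of the given shape (shape parameter); the dtype defaults to float64 and can be set with the dtype parameter

np.zeros(): creates an all-zeros array of the given shape; the dtype defaults to float64

np.empty(): creates an uninitialized array of the given shape (its contents are arbitrary)

np.identity(): creates an identity matrix of the given size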
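The creation functions above can be tried out as follows. This is a minimal sketch; the variable names and the sample list are made up for illustration. It also shows one difference between np.array() and np.asarray() that the list above does not spell out: when the input is already an ndarray, np.asarray() returns it without copying.

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]               # sample input, chosen for illustration

a = np.array(data)                          # list -> ndarray
b = np.asarray(a)                           # input is already an ndarray: no copy is made
ones = np.ones((2, 3))                      # all ones, float64 by default
zeros = np.zeros((2, 3), dtype=np.int32)    # dtype overrides the float64 default
empty = np.empty((2, 3))                    # uninitialized memory, values are arbitrary
eye = np.identity(3)                        # 3x3 identity matrix

print(b is a)         # True
print(ones.dtype)     # float64
print(zeros.dtype)    # int32
```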

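np.reshape(): changes the shape of an array

np.transpose(a, axes=(0, 1, 2, …)): transposes an array; axes gives the new order of the dimensions

axis parameter: 0 runs down the rows (vertical), 1 runs across the columns (horizontal)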
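A short sketch of reshaping, transposing and the axis parameter, using small made-up arrays. Summing is used here only as a convenient way to show which direction each axis runs in.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)            # [[0 1 2], [3 4 5]]
t = np.transpose(m)                       # plain transpose, shape (3, 2)

cube = np.arange(24).reshape(2, 3, 4)
swapped = np.transpose(cube, (2, 0, 1))   # new axis order, shape (4, 2, 3)

col_sums = m.sum(axis=0)                  # collapse the rows: one sum per column
row_sums = m.sum(axis=1)                  # collapse the columns: one sum per row

print(col_sums)   # [3 5 7]
print(row_sums)   # [ 3 12]
```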

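np.concatenate(): joins arrays along an existing axis

np.split(name_of_matrix, indices_or_sections=[position_of_cutting, …], axis=…): splits an array at the given positions along the given axis

np.hsplit(): splits horizontally, i.e. column-wise (along axis 1)

np.vsplit(): splits vertically, i.e. row-wise (along axis 0)

np.copy(): creates a copy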
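The joining, splitting and copying functions can be sketched like this. The arrays and the cut positions are arbitrary choices for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.arange(12, 24).reshape(3, 4)

stacked_rows = np.concatenate([a, b], axis=0)   # shape (6, 4): b placed below a
stacked_cols = np.concatenate([a, b], axis=1)   # shape (3, 8): b placed right of a

# Cut before column 1 and before column 3: pieces are 1, 2 and 1 columns wide.
left, middle, right = np.split(a, [1, 3], axis=1)

top, bottom = np.vsplit(stacked_rows, 2)        # two equal row blocks
h1, h2 = np.hsplit(a, 2)                        # two equal column blocks

c = np.copy(a)                                  # independent copy
c[0, 0] = 99                                    # does not touch a
print(a[0, 0], c[0, 0])                         # 0 99
```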

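Attributes (of an ndarray arr):

arr.ndim: number of dimensions

arr.shape: shape of the array

arr.dtype: element type (all elements share one type; mixed inputs are converted automatically)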
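A small sketch of the three attributes and of the automatic type conversion, with made-up sample values.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.ndim)     # 2
print(arr.shape)    # (2, 3)

mixed = np.array([1, 2.5, 3])    # ints and floats: everything becomes float64
text = np.array([1, "a"])        # numbers and strings: everything becomes a string

print(mixed.dtype)       # float64
print(text.dtype.kind)   # U (unicode string)
```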

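pandas (pd):

Methods:

I. Reading data:

.csv file: pd.read_csv(csv_file_path)

.txt file: pd.read_csv(txt_path, sep=' ', header=None, names=['', '', '']), where sep is the delimiter, header is the row number of the header line (None if the file has no header line) and names lists the column names

.xlsx (Excel): pd.read_excel(xlsx_file_path)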
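A minimal sketch of the two read_csv calls. To keep it self-contained, the file contents are held in in-memory strings (io.StringIO) instead of real files; the column names and values are invented for illustration. pd.read_excel is called the same way with a path, but it needs an Excel engine such as openpyxl to be installed, so it is left out here.

```python
import io
import pandas as pd

# Comma-separated text with a header line.
csv_text = "name,score\nalice,90\nbob,85\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# Space-separated text without a header line: supply the column names ourselves.
txt_text = "alice 90\nbob 85\n"
df_txt = pd.read_csv(io.StringIO(txt_text), sep=" ", header=None,
                     names=["name", "score"])

print(df_csv)
print(df_txt)
```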

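II. Data structures:

Series: values + index, similar to a one-dimensional array

Creating a Series:

pd.Series([content], index=[index]) (the number of index labels must equal the number of values)

pd.Series({'key (index)': value, …})

S1['index'] returns the value; S1[['index', …]] returns several values

Indexing and slicing a Series:

(1) Explicit (label) index: by label, or with .loc[start:end], which includes both endpoints

(2) Implicit (positional) index: by integer position, or with .iloc[start:end], which includes the start and excludes the end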
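The difference between the closed .loc slice and the half-open .iloc slice is easiest to see side by side. A minimal sketch with invented labels and values:

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40], index=list("abcd"))   # values + explicit index
s2 = pd.Series({"a": 10, "b": 20})                      # dict keys become the index

one = s1["b"]                  # single label -> single value
several = s1[["a", "c"]]       # list of labels -> Series

by_label = s1.loc["a":"c"]     # both endpoints included: a, b, c
by_position = s1.iloc[0:2]     # end excluded: positions 0 and 1 (a, b)

print(by_label.tolist())       # [10, 20, 30]
print(by_position.tolist())    # [10, 20]
```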

DataFrame: tabular; can be viewed as a dictionary of Series, similar to a two-dimensional array

Row index: index; column index: columns; values: values

Creating a DataFrame:

pd.DataFrame(matrix, index, columns)

pd.DataFrame(np.random.randint(0, 150, (3, 5)), index=list('abc'), columns=['py', 'java', 'C', 'C++', 'C#'])

pd.DataFrame({column: [values], …}, index)

D1['column'].astype(): changes the data type of the column
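Both construction styles and astype() in one runnable sketch. The dictionary values are invented for illustration; the random scores differ on every run.

```python
import numpy as np
import pandas as pd

# From a 2-D array plus row and column labels.
scores = pd.DataFrame(np.random.randint(0, 150, (3, 5)),
                      index=list("abc"),
                      columns=["py", "java", "C", "C++", "C#"])

# From a dict of columns plus row labels.
from_dict = pd.DataFrame({"py": [90, 80, 70], "java": [60, 50, 40]},
                         index=list("abc"))

# astype returns a new Series; assign it back to change the column's type.
from_dict["py"] = from_dict["py"].astype("float64")

print(scores.shape)       # (3, 5)
print(from_dict.dtypes)
```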

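III. Querying data in pandas

1. By value 2. By list 3. By range 4. By condition 5. By function

Methods:

Use one of the columns as the index:

object.set_index('indexname', inplace=True)

Change the data and type of a column:

object['column_name'] = object['column_name'].str.replace('before', 'after').astype('new_dtype') (in pandas 2.x, assigning through object.loc[:, column_name] writes into the existing column and may keep its old dtype)

Find index labels that start with a given prefix:

object.index.str.startswith(~)

Unique values:

object['column'].unique() returns an array of the distinct values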
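The helper methods above, applied to a tiny made-up table. The column names date, temp and weather and all values are hypothetical, chosen only so that each call has something to work on.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2018-01-01", "2018-01-02", "2018-02-01"],
    "temp": ["3C", "2C", "5C"],
    "weather": ["sunny", "cloudy", "sunny"],
})

df.set_index("date", inplace=True)          # the date column becomes the index

# Strip the unit, then convert the strings to integers.
df["temp"] = df["temp"].str.replace("C", "").astype("int32")

january = df.index.str.startswith("2018-01")   # boolean array, one entry per row
kinds = df["weather"].unique()                 # array of distinct values

print(january.tolist())   # [True, True, False]
print(list(kinds))        # ['sunny', 'cloudy']
```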

Queries:

Single values:

object.loc[index, column] returns a single value

object.loc[index, [columns]] returns a Series

object.loc[[indexes], column] returns a Series

object.loc[[indexes], [columns]] returns a DataFrame

Range query (both endpoints included):

object.loc[index_start:index_end, column_start:column_end]

Conditional query: object.loc[object[column] ? condition, :]

Function query:

Anonymous function: object.loc[lambda object: (object[column] < condition), :]

User-defined function:

def query_my_data(object):
    return object.index.str.startswith('~') & ~

object.loc[query_my_data, :]
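All five query styles on one small table. This is a sketch: the dates and temperatures are invented, and the thresholds are arbitrary.

```python
import pandas as pd

df = pd.DataFrame(
    {"bWendu": [3, 35, 20, 34], "yWendu": [-12, 25, 10, 22]},
    index=["2018-01-01", "2018-07-01", "2018-07-02", "2018-08-01"],
)

value = df.loc["2018-01-01", "bWendu"]                               # single value
row = df.loc["2018-01-01", ["bWendu", "yWendu"]]                     # Series
col = df.loc[["2018-01-01", "2018-07-01"], "bWendu"]                 # Series
block = df.loc[["2018-01-01", "2018-07-01"], ["bWendu", "yWendu"]]   # DataFrame

# Label slices include both endpoints.
span = df.loc["2018-07-01":"2018-08-01", "bWendu":"yWendu"]

hot = df.loc[df["bWendu"] > 33, :]                 # boolean condition
mild = df.loc[lambda d: d["bWendu"] < 30, :]       # anonymous function

def query_my_data(d):
    # Rows from July 2018 whose high is above 30.
    return d.index.str.startswith("2018-07") & (d["bWendu"] > 30)

july_hot = df.loc[query_my_data, :]                # named function

print(len(span), len(hot), len(mild), len(july_hot))   # 3 2 2 1
```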

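IV. Adding new columns in pandas:

1. Direct assignment

object.loc[:, 'new_column_name'] = ?

2. apply

apply(function, axis)

# Define a function that returns the temperature category of a row
def get_wendu_type(x):
    if x['bWendu'] > 33:
        return 'high'
    elif x['yWendu'] < -10:
        return 'low'
    return 'normal'

df.loc[:, 'wendu_type'] = df.apply(get_wendu_type, axis=1)

3. assign

Similar to apply, but it can add several columns at once and needs no axis argument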
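To compare apply and assign directly, here is a runnable sketch on three invented rows. The new column names wencha and bWendu_f and the Fahrenheit conversion are illustrative choices, not part of the original notes.

```python
import pandas as pd

df = pd.DataFrame({"bWendu": [35, 20, 5], "yWendu": [25, 10, -15]})

def get_wendu_type(x):
    # x is one row, because apply is called with axis=1 below.
    if x["bWendu"] > 33:
        return "high"
    elif x["yWendu"] < -10:
        return "low"
    return "normal"

# apply: one new column, computed row by row.
df["wendu_type"] = df.apply(get_wendu_type, axis=1)

# assign: several new columns in one call, each computed from whole columns.
df = df.assign(
    wencha=lambda d: d["bWendu"] - d["yWendu"],
    bWendu_f=lambda d: d["bWendu"] * 9 / 5 + 32,
)

print(df["wendu_type"].tolist())   # ['high', 'normal', 'low']
print(df["wencha"].tolist())       # [10, 10, 20]
```

Note that assign returns a new DataFrame rather than modifying the existing one, which is why the result is bound back to df.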

4. Conditional assignment

weather['wencha_type'] = ''
weather.loc[weather['bWendu'] - weather['yWendu'] > 10, 'wencha_type'] = 'large'
weather.loc[weather['bWendu'] - weather['yWendu'] <= 10, 'wencha_type'] = 'normal'

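General pattern: create the column with an empty default (object['new_column'] = ''), then fill it with object.loc[condition, 'new_column'] = value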
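The conditional-assignment pattern on a small invented table, with the temperature difference computed once and reused for both conditions:

```python
import pandas as pd

weather = pd.DataFrame({"bWendu": [35, 20, 5], "yWendu": [25, 5, -15]})

weather["wencha_type"] = ""                      # start with an empty default
diff = weather["bWendu"] - weather["yWendu"]     # 10, 15, 20

weather.loc[diff > 10, "wencha_type"] = "large"
weather.loc[diff <= 10, "wencha_type"] = "normal"

print(weather["wencha_type"].tolist())   # ['normal', 'large', 'large']
```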

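V. Handling missing data

Methods:

isnull(): returns True where the value is NaN

notnull(): returns False where the value is NaN

Note: .any() is True if at least one value is True; .all() is True only if every value is True

Both accept an axis argument to report the result per row or per column

drop(name, axis): removes a row by default (axis=0), e.g. 3x3 → 2x3

It removes a row or column by label, whether or not it contains NaN

dropna(): removes the rows (or columns, depending on axis) that contain NaN

fillna(): fills missing values and returns the filled DataFrame

Fill with the mean: object.fillna(object.mean())

Fill with a fixed value: object.fillna(value)

method parameter: selects the fill direction (bfill fills from the next row or the column to the right; ffill fills from the previous row or the column to the left). Recent pandas versions deprecate fillna(method=…) in favor of object.ffill() and object.bfill()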
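The missing-data methods applied to one small invented table. Directional filling is done with ffill() and bfill() rather than fillna(method=…), since the latter is deprecated in recent pandas releases.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, np.nan],
    "c": [7.0, 8.0, 9.0],
})

cols_with_nan = df.isnull().any(axis=0)    # per column: does it contain any NaN?
complete_rows = df.notnull().all(axis=1)   # per row: are all values present?

without_row = df.drop(0, axis=0)           # 3x3 -> 2x3, removed by label
without_col = df.drop("c", axis=1)         # 3x3 -> 3x2

rows_kept = df.dropna()                    # only rows without NaN
cols_kept = df.dropna(axis=1)              # only columns without NaN

filled_mean = df.fillna(df.mean())         # each column's NaN gets that column's mean
filled_zero = df.fillna(0)
forward = df.ffill()                       # copy the previous row's value down
backward = df.bfill()                      # copy the next row's value up

print(cols_with_nan.tolist())   # [True, True, False]
print(complete_rows.tolist())   # [True, False, False]
```

In this data the last value of column b has no following row, so it stays NaN after bfill().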

VI. Grouping and merging:

group_by_name = object.groupby('column_name')

Example: group_by_name.sum() gives the sum of each integer column, indexed by Name

Original DataFrame

      Name  Year  Salary  Bonus
0  laowang  2016   10000   3000  1
1  laosong  2016    2000   1000  2
2  laosong  2016    4000   1000  3
3  rongmei  2016    5000   1200  4
4  laowang  2017   18000   4000  5
5  laowang  2017   25000   2300  6
6  laosong  2017    3000    500  7
7  laowang  2017    4000   1000  8

print(group_by_name.sum())

         Year  Salary  Bonus
Name
laosong  6049    9000   2500
laowang  8067   57000  10300
rongmei  2016    5000   1200

group_by_name.aggregate(['sum', np.mean, np.std]) applies several functions at once (np.std is the standard deviation); .agg is an alias for .aggregate
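The grouping example can be reproduced with the Name, Year, Salary and Bonus values from the table above (the unlabeled last column is left out). In this sketch the aggregation functions are passed as strings, which recent pandas versions prefer over NumPy function objects, and the aggregation is limited to the Salary column to keep the output small.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["laowang", "laosong", "laosong", "rongmei",
             "laowang", "laowang", "laosong", "laowang"],
    "Year": [2016, 2016, 2016, 2016, 2017, 2017, 2017, 2017],
    "Salary": [10000, 2000, 4000, 5000, 18000, 25000, 3000, 4000],
    "Bonus": [3000, 1000, 1000, 1200, 4000, 2300, 500, 1000],
})

group_by_name = df.groupby("Name")

totals = group_by_name.sum()                                  # one row per name
stats = group_by_name["Salary"].agg(["sum", "mean", "std"])   # several functions at once

print(totals)
print(stats)
```

Summing the Year column is arithmetically valid but rarely meaningful; selecting the columns of interest first, as in the second call, avoids it.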

VII. Matching and joining tables

Concatenation: concat(), append()
Merging: merge(), join()

concat:

Same columns, different index: use concat

Step 1: define a list [df1, df2, …]

Step 2: pd.concat(list, sort=False)

keys parameter: labels each DataFrame in the list
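The two concat steps and the keys parameter, sketched on two small frames that share the same columns. The key labels first and second are arbitrary.

```python
import pandas as pd

df1 = pd.DataFrame({"apts": [55000, 60000], "cars": [200000, 300000]},
                   index=["Shanghai", "Beijing"])
df2 = pd.DataFrame({"apts": [58000], "cars": [250000]},
                   index=["Shenzhen"])

frames = [df1, df2]                          # step 1: collect the frames in a list
combined = pd.concat(frames, sort=False)     # step 2: stack them row-wise

# keys adds an outer index level that records which frame each row came from.
labeled = pd.concat(frames, keys=["first", "second"])

print(combined)
print(labeled.loc["second"])                 # the rows that came from df2
```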

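append(): appends rows to a DataFrame (removed in pandas 2.0; use pd.concat() instead)

merge():

Accepts join conditions

Works like JOIN in SQL

df1.merge(df4, on='city')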

df1:

    apts    cars      city
0  55000  200000  Shanghai
1  60000  300000   Beijing
2  58000  250000  Shenzhen

df4:

   salaries       city
0     10000     Suzhou
1     30000    Beijing
2     30000   Shanghai
3     20000  Guangzhou
4     15000    Tianjin

Result of df1.merge(df4, on='city'):

    apts    cars      city  salaries
0  55000  200000  Shanghai     30000
1  60000  300000   Beijing     30000
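The merge above can be reproduced with the values from the two tables. By default merge performs an inner join, which is why only the cities present in both frames survive; the how='left' variant is added here for contrast and is not part of the original notes.

```python
import pandas as pd

df1 = pd.DataFrame({
    "apts": [55000, 60000, 58000],
    "cars": [200000, 300000, 250000],
    "city": ["Shanghai", "Beijing", "Shenzhen"],
})
df4 = pd.DataFrame({
    "salaries": [10000, 30000, 30000, 20000, 15000],
    "city": ["Suzhou", "Beijing", "Shanghai", "Guangzhou", "Tianjin"],
})

inner = df1.merge(df4, on="city")               # only cities found in both frames
left = df1.merge(df4, on="city", how="left")    # every row of df1, NaN where unmatched

print(inner)
print(left)
```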