Pandas Data Analysis Notes (Part 1)

洛丹伦全境守护者

Article link: https://blog.csdn.net/JeallyBean/article/details/90232766

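I. DataFrame Data Operations

1. Adding a new column

Add a new column label and assign data to it.
Syntax: data['name'] = ...
The assigned values must have as many entries as the DataFrame has rows.

Demo code: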

import pandas as pd
import numpy as np

# The names parameter assigns column names to data that has no header row
data = pd.read_csv("D:/DataSet/Demo/iris.csv",encoding="utf-8",names=['Sep_len','Sep_wid','Pet_len','Pet_wid','Iris_type'])
print(data.head())
print("==================================")
data['Plus_all'] = data['Sep_len']+data['Sep_wid']

print(data.head())

   Sep_len  Sep_wid  Pet_len  Pet_wid    Iris_type
0      5.1      3.5      1.4      0.2  Iris-setosa
1      4.9      3.0      1.4      0.2  Iris-setosa
2      4.7      3.2      1.3      0.2  Iris-setosa
3      4.6      3.1      1.5      0.2  Iris-setosa
4      5.0      3.6      1.4      0.2  Iris-setosa
==================================
   Sep_len  Sep_wid  Pet_len  Pet_wid    Iris_type  Plus_all
0      5.1      3.5      1.4      0.2  Iris-setosa       8.6
1      4.9      3.0      1.4      0.2  Iris-setosa       7.9
2      4.7      3.2      1.3      0.2  Iris-setosa       7.9
3      4.6      3.1      1.5      0.2  Iris-setosa       7.7
4      5.0      3.6      1.4      0.2  Iris-setosa       8.6

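Note that the number of values in the new column must equal the number of rows in the existing table (not the number of columns). If you supply the data by hand and the lengths differ, pandas raises an error: Length of values does not match length of index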
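
The length mismatch can be reproduced with a minimal sketch. It uses a small hypothetical three-row DataFrame rather than the iris file, and the column names `a`, `b` and `c` are made up for illustration:

```python
import pandas as pd

# Hypothetical three-row frame standing in for the iris data
df = pd.DataFrame({"a": [1, 2, 3]})

# A list with one value per row is accepted
df["b"] = [10, 20, 30]

# A list that is too short is rejected with a ValueError
try:
    df["c"] = [10, 20]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the message varies between pandas versions, but it reports that the length of the values does not match the length of the index.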

2. Adding a column by assigning a single value

Every row receives the same value.

Continuing from the code block above:

data['Status'] = 'True'
print(data.head())

   Sep_len  Sep_wid  Pet_len  Pet_wid    Iris_type  Plus_all Status
0      5.1      3.5      1.4      0.2  Iris-setosa       8.6   True
1      4.9      3.0      1.4      0.2  Iris-setosa       7.9   True
2      4.7      3.2      1.3      0.2  Iris-setosa       7.9   True
3      4.6      3.1      1.5      0.2  Iris-setosa       7.7   True
4      5.0      3.6      1.4      0.2  Iris-setosa       8.6   True
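
A small sketch of the same scalar assignment on a hypothetical frame (the column names `status_text` and `status_flag` are assumptions for illustration). It also shows that assigning `'True'` in quotes stores text, while an unquoted `True` stores a boolean:

```python
import pandas as pd

# Hypothetical three-row frame
df = pd.DataFrame({"a": [1, 2, 3]})

# A single value is repeated for every row
df["status_text"] = "True"   # the string 'True'
df["status_flag"] = True     # the boolean True

print(df)
print(df.dtypes)
```

If the column is later used for filtering, the boolean form is probably the more convenient choice.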

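3. Inserting a row at a given position with loc slicing

loc slices by index label or by boolean condition, with usage like loc[row_label, column_label]. Note that it takes index labels, not index positions, and that a label slice includes both endpoints.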
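
The difference between labels and positions is easiest to see on a frame whose index does not start at 0. The following is a minimal sketch on a hypothetical DataFrame (the column name `value` and the index labels are made up), using `iloc` as the positional counterpart of `loc`:

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=[2, 3, 4, 5])

by_label = df.loc[3, "value"]    # row labelled 3
by_position = df.iloc[3, 0]      # fourth row, first column

# Label slices include both endpoints; positional slices exclude the stop
label_slice = df.loc[2:4]        # labels 2, 3 and 4
position_slice = df.iloc[2:4]    # positions 2 and 3, i.e. labels 4 and 5

# loc also accepts a boolean condition
filtered = df.loc[df["value"] > 20]

print(by_label, by_position)
print(label_slice)
print(position_slice)
print(filtered)
```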

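print(data.shape)
# One-row DataFrame holding the new record
new_info = pd.DataFrame({'Sep_len': 6.6,
                         'Sep_wid': 6.6,
                         'Pet_len': 6.6,
                         'Pet_wid': 6.6,
                         'Plus_all': 6.6,
                         'Iris_type': 'Test-Iris',
                         'Status': 'False'},
                        index=[1])
# loc slices by label and includes both endpoints
above = data.loc[:148]   # rows labelled 0 to 148
below = data.loc[149:]   # row labelled 149
# pd.concat replaces DataFrame.append, which pandas 2.0 removed;
# sort=True orders the columns alphabetically
newdata = pd.concat([above, new_info, below], ignore_index=True, sort=True)
# To add the row at the end instead:
# data = pd.concat([data, new_info], ignore_index=True)

print(newdata.tail())
print(newdata.shape)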

(150, 7)
          Iris_type  Pet_len  Pet_wid  Plus_all  Sep_len  Sep_wid Status
146  Iris-virginica      5.0      1.9       8.8      6.3      2.5   True
147  Iris-virginica      5.2      2.0       9.5      6.5      3.0   True
148  Iris-virginica      5.4      2.3       9.6      6.2      3.4   True
149       Test-Iris      6.6      6.6       6.6      6.6      6.6  False
150  Iris-virginica      5.1      1.8       8.9      5.9      3.0   True
(151, 7)
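
The split-and-rejoin steps above can be wrapped in a small helper. This is only a sketch under stated assumptions: `insert_row` is a hypothetical name, not a pandas function, and it splits by position with `iloc` rather than by label, so it does not depend on the index labels being 0, 1, 2, ...

```python
import pandas as pd


def insert_row(df, row, position):
    """Return a new DataFrame with `row` (a dict) inserted before `position`."""
    new_row = pd.DataFrame([row])
    above = df.iloc[:position]   # rows before the insertion point
    below = df.iloc[position:]   # rows from the insertion point onward
    # ignore_index=True renumbers the result 0, 1, 2, ...
    return pd.concat([above, new_row, below], ignore_index=True)


# Hypothetical sample data
df = pd.DataFrame({"x": [1, 2, 3], "label": ["a", "b", "c"]})
result = insert_row(df, {"x": 99, "label": "new"}, 2)
print(result)
```

The original frame is left unchanged, because `pd.concat` returns a new object.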