R Language: RStudio Shortcuts

Latest recommended article published on 2024-07-02 10:24:12

LT-CAT努力前进

Category: Python algorithms. Tags: python

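Reading Data into a DataFrame with Pandas
When analysing data with Pandas in Python, the DataFrame is the most commonly used data structure. I previously wrote an article introducing basic Pandas usage; afterwards some readers asked how to read data from a database or from a file, so this separate article covers how Pandas reads data into a DataFrame.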

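1. Reading MySQL data with Pandas
To read data from MySQL, first install the MySQLdb package (published on PyPI as mysqlclient for Python 3). Suppose the database is installed locally, the username is myusername, the password is mypassword, and the data to read is in the mydb database; the corresponding code is: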

import pandas as pd
import MySQLdb

# Connect to the local MySQL server
mysql_cn = MySQLdb.connect(host='localhost', port=3306, user='myusername', passwd='mypassword', db='mydb')
# Read every row of the test table into a DataFrame
df = pd.read_sql('select * from test;', con=mysql_cn)
mysql_cn.close()

The code above reads all rows of the test table into df, and df is a DataFrame.
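
For readers without a MySQL server at hand, the same `pd.read_sql` call can be tried against an in-memory SQLite database. This is only a sketch under stated assumptions: it swaps MySQLdb for Python's built-in `sqlite3` module, and the `test` table rows below are made-up sample data, not data from the original setup.

```python
import sqlite3
import pandas as pd

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?)", [(1, "a"), (2, "b")])
conn.commit()

try:
    # Same call shape as the MySQL example: SQL text plus an open connection.
    sample_df = pd.read_sql("select * from test;", con=conn)
finally:
    # Close the connection even if the query fails.
    conn.close()

print(sample_df)
```

Wrapping the query in try/finally is a design choice made here so the connection is released even when the query raises.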
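2. Reading CSV file data with Pandas
Reading data from a CSV file with Pandas is much simpler and needs no extra package. Suppose we want to read the data in test.csv; the corresponding code is:
df = pd.read_csv('test.csv', header=None, sep=',')
header=None means the file has no header row, and sep=',' means the fields are separated by commas.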
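
The effect of `header=None` is easiest to see on a tiny input. A minimal sketch, assuming an in-memory string (the hypothetical `csv_text` below) stands in for the contents of test.csv; the sample rows and the column names passed to `names=` are invented for illustration.

```python
import io
import pandas as pd

# Assumption: this in-memory text stands in for the contents of test.csv.
csv_text = "1,alice,3.5\n2,bob,4.0\n"

# header=None: the first line is data, so pandas numbers the columns 0, 1, 2.
csv_df = pd.read_csv(io.StringIO(csv_text), header=None, sep=",")
print(csv_df)

# Optional: names= supplies column labels when the file has no header row.
named_df = pd.read_csv(io.StringIO(csv_text), header=None, sep=",",
                       names=["id", "name", "score"])
print(named_df.columns.tolist())
```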
Practical notes on pandas.DataFrame operations (creating, indexing, adding, deleting)

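I have searched online for many guides to pandas.DataFrame operations. They all cover the basics, but combining those operations correctly on a DataFrame still takes time, and I spent quite a while debugging. Here is a summary for everyone's convenience.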

I. Simple ways to create a DataFrame:
1. Creating from a dictionary:

In [1]: import pandas as pd
In [3]: aa={'one':[1,2,3],'two':[2,3,4],'three':[3,4,5]}
In [4]: bb=pd.DataFrame(aa)
In [5]: bb
Out[5]: 
   one  three  two
0    1      3    2
1    2      4    3
2    3      5    4

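The dictionary's keys become the DataFrame's columns. No index values are given, so you need to set them yourself; if you don't, the index defaults to counting from zero. (Older pandas versions sorted dictionary columns alphabetically, as in the output above; recent versions keep the dictionary's insertion order.)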

bb=pd.DataFrame(aa,index=['first','second','third'])
bb
Out[7]: 
        one  three  two
first     1      3    2
second    2      4    3
third     3      5    4
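
Since the index labels are what later lookups rely on, here is a minimal sketch of label-based indexing on the same data. The variable names `default_df` and `value` are introduced here for illustration only.

```python
import pandas as pd

aa = {'one': [1, 2, 3], 'two': [2, 3, 4], 'three': [3, 4, 5]}

# Without index=, pandas assigns the default labels 0, 1, 2.
default_df = pd.DataFrame(aa)
print(default_df.index.tolist())

# With index=, each row gets the label supplied.
bb = pd.DataFrame(aa, index=['first', 'second', 'third'])

# .loc looks up by row label and column label.
value = bb.loc['second', 'two']
print(value)
```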

2. Creating from a multi-dimensional array

import numpy as np
In [9]: del aa
In [10]: aa=np.array([[1,2,3],[4,5,6],[7,8,9]])
In [11]: aa
Out[11]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
In [12]: bb=pd.DataFrame(aa)
In [13]: bb
Out[13]: 
   0  1  2
0  1  2  3
1  4  5  6
2  7  8  9

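When creating from a multi-dimensional array, you need to give the DataFrame its columns and index yourself; otherwise the defaults are used, which look ugly.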

bb=pd.DataFrame(aa,index=[22,33,44],columns=['one','two','three'])
In [15]: bb
Out[15]: 
    one  two  three
22    1    2      3
33    4    5      6
44    7    8      9
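
With custom integer labels such as 22, 33, 44, it helps to distinguish label-based from position-based access. A small sketch on the same array; the names `by_label` and `by_position` are illustrative, and the final block only demonstrates that the label counts must match the array's shape.

```python
import numpy as np
import pandas as pd

aa = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# The index needs one label per row, and columns one label per column.
bb = pd.DataFrame(aa, index=[22, 33, 44], columns=['one', 'two', 'three'])

# .loc uses the labels (22, 33, 44); .iloc uses positions (0, 1, 2).
by_label = bb.loc[33, 'two']
by_position = bb.iloc[1, 1]
print(by_label, by_position)

# A mismatched label count raises ValueError.
try:
    pd.DataFrame(aa, index=[22, 33], columns=['one', 'two', 'three'])
except ValueError as err:
    print("shape mismatch:", err)
```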

3. Creating from another DataFrame

bb=pd.DataFrame(aa,index=[22,33,44],columns=['one','two','three'])
bb
Out[15]: 
    one  two  three
22    1    2      3
33    4    5      6
44    7    8      9
cc=bb[['one','three']].copy()
cc
Out[17]: 
    one  three
22    1      3
33    4      6
44    7      9

The copy here is a deep copy: changing values in cc does not change values in bb.

cc.loc[22,'three']=5
bb
Out[19]: 
    one  two  three
22    1    2      3
33    4    5      6
44    7    8      9

cc
Out[20]: 
    one  three
22    1      5
33    4      6
44    7      9
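
The independence of the copy can also be checked programmatically. A minimal sketch, rebuilding `bb` from a dictionary so the block is self-contained, and using `.loc` for the assignment (a choice made here because single-step label assignment avoids pandas' chained-assignment warnings).

```python
import pandas as pd

bb = pd.DataFrame({'one': [1, 4, 7], 'two': [2, 5, 8], 'three': [3, 6, 9]},
                  index=[22, 33, 44])

# copy() makes an independent copy (deep=True is the default).
cc = bb[['one', 'three']].copy()

# .loc[row_label, column_label] assigns in a single step.
cc.loc[22, 'three'] = 5

print(bb.loc[22, 'three'])  # bb keeps its original value
print(cc.loc[22, 'three'])  # only cc changed
```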
