2.2 pd.读取与存储数据

 1、excel

data = pd.read_excel(io,
           sheetname=0, header=0, 
           skiprows=None, skip_footer=0, 
           index_col=None, names=None, 
           parse_cols=None, parse_dates=False,
           date_parser=None, na_values=None,
           thousands=None, convert_float=True, 
           has_index_names=None, converters=None,
           dtype=None, true_values=None, 
           false_values=None, engine=None,
           squeeze=False, **kwds)

常用参数解析:

io : string, path object ; excel 路径。
sheetname : string, int, mixed list of strings/ints, or None, default 0 返回多表使用sheetname=[0,1],若sheetname=None是返回全表 注意:int/string 返回的是dataframe,而none和list返回的是dict of dataframe
header : int, list of ints, default 0 指定列名行,默认0,即取第一行,数据为列名行以下的数据 若数据不含列名,则设定 header = None
skiprows : list-like,Rows to skip at the beginning,省略指定行数的数据
skip_footer : int,default 0, 省略从尾部数的int行数据
index_col : int, list of ints, default None指定列为索引列,也可以使用u”strings”
names : array-like, default None, 指定列的名字。

import numpy as np
import pandas as pd

io='C:/Users/23008/Desktop/AJPY/test.xls'
data = pd.read_excel(io)
#默认:第一个表sheet[0],列名为第一行header=0
print(type(data))
print(data.head())
print(data.tail())
<class 'pandas.core.frame.DataFrame'>
     a    131 wevw  1
0    b  132.0    4  2
1  svv  133.0   w3  3
2   sv  134.0   43  4
3   vs  135.0   r4  5
4  fsv  136.0  NaN  6
         a    131 wevw   1
19      bu  151.0   66  21
20     bye  152.0   67  22
21  yebrtu  153.0   68  23
22      ye  154.0   69  24
23      ey  155.0   70  25

 

 

data_1 = pd.read_excel(io,header=None,sheetname=1)
#第一行不做列名,读第二个工作表
print(data_1.head())
       0   1
0   dsaa  33
1    758  34
2     58  35
3  58568  36
4    143  37
data_2= pd.read_excel(io,skiprows=[4],skip_footer=3)
#skiprows=[4]省略第五行,=4省略前五行
#skip_footer=3省略后三行,=[3]倒数第三行
print(data_2.head())
print(data_2.tail())
        a    131 wevw  1
0       b  132.0    4  2
1     svv  133.0   w3  3
2      sv  134.0   43  4
3     fsv  136.0  NaN  6
4  svsvsv  137.0   52  7
         a    131 wevw   1
15  yebrtu  148.0   63  18
16     bey  149.0   64  19
17      b5    NaN   65  20
18      bu  151.0   66  21
19     bye  152.0   67  22
data_3= pd.read_excel(io,header=None,names=["a","b","c","e"])
#name参数,指定列名
print(data_3.head())
     a      b     c  e
0    a  131.0  wevw  1
1    b  132.0     4  2
2  svv  133.0    w3  3
3   sv  134.0    43  4
4   vs  135.0    r4  5

data_4= pd.read_excel(io,index_col="a")
#index_col="a",指定a列为索引列
print(data_4.head())
       131 wevw  1
a                 
b    132.0    4  2
svv  133.0   w3  3
sv   134.0   43  4
vs   135.0   r4  5
fsv  136.0  NaN  6

2、存储数据

to_excel(self, excel_writer, sheet_name='Sheet1', na_rep='', 
         float_format=None,columns=None, header=True, index=True, 
         index_label=None,startrow=0, startcol=0, engine=None, 
         merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None)

常用参数解析 
- excel_writer : string or ExcelWriter object 目标路径 
- sheet_name : string, default ‘Sheet1’ Name of sheet which will contain DataFrame,填充excel的第几页 
- na_rep : string, default ”,Missing data representation 缺失值填充 
- float_format : string, default None Format string for floating point numbers 
- columns : sequence, optional,Columns to write 选择输出的的列。 
- header : boolean or list of string, default True Write out column names. If a list of string is given it is assumed to be aliases for the column names 
- index : boolean, default True,Write row names (index) 
- index_label : string or sequence, default None, Column label for index column(s) if desired. If None is given, andheader and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex. 
- startrow :upper left cell row to dump data frame 
- startcol :upper left cell column to dump data frame 
- engine : string, default None ,write engine to use - you can also set this via the options,io.excel.xlsx.writer, io.excel.xls.writer, andio.excel.xlsm.writer. 
- merge_cells : boolean, default True Write MultiIndex and Hierarchical Rows as merged cells. 
- encoding: string, default None encoding of the resulting excel file. Only necessary for xlwt,other writers support unicode natively. 
- inf_rep : string, default ‘inf’ Representation for infinity (there is no native representation for infinity in Excel) 
- freeze_panes : tuple of integer (length 2), default None Specifies the one-based bottommost row and rightmost column that is to be frozen
 

io='C:/Users/23008/Desktop/AJPY/test.xls'
io1='C:/Users/23008/Desktop/AJPY/test11.xls'
data = pd.read_excel(io)
data.to_excel(io1,sheet_name="sheet2",na_rep="NULL",header=["a","b","c","d"])
#存储到test11第二个工作表中,缺失值填充为“NULL”,列名为abcd

io1='C:/Users/23008/Desktop/AJPY/test11.xls'
io2='C:/Users/23008/Desktop/AJPY/test12.xls'
data = pd.read_excel(io1)
data.to_excel(io2,columns=['b','c'],index_label=["ID"])
#columns=['b','c']:只存bc两列
#行索引默认有,index_label=["ID"],行索引列名为ID

 

 

data.to_excel(io2,columns=['b','c'],index=None)
#index=None不要行索引,默认有

  • 1
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值