# Generate a time series with pd.date_range
>>> date = pd.date_range('20130102', periods=6)
>>> date
Out[11]:
DatetimeIndex(['2013-01-02', '2013-01-03', '2013-01-04', '2013-01-05',
               '2013-01-06', '2013-01-07'],
              dtype='datetime64[ns]', freq='D')
>>> df = pd.DataFrame(np.random.randn(6, 4), index=date, columns=list('ABCD'))
>>> df
Out[15]:
                   A         B         C         D
2013-01-02 -1.282088 -0.673884 -0.260690  0.833551
2013-01-03 -1.080849  0.492077 -0.894507 -2.557578
2013-01-04 -0.577583  0.855389  0.180534  0.040379
2013-01-05  0.736744  1.221697  1.386126  0.065165
2013-01-06 -0.979877 -2.400284 -0.127889  0.221825
2013-01-07 -0.638150  1.503574 -1.440053  0.733958
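As a small sketch of the same pattern, the snippet below passes `freq` explicitly and seeds the random generator so the values are the same on every run. The seed and the names `rng`, `dates`, and `frame` are illustrative assumptions and do not come from the session above.

```python
import numpy as np
import pandas as pd

# Six consecutive calendar days starting at 2013-01-02 ('D' is the default frequency).
dates = pd.date_range('20130102', periods=6, freq='D')

# Seeded generator so the random values are reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list('ABCD'))

print(frame.shape)       # (6, 4)
print(frame.index[0])    # 2013-01-02 00:00:00
print(frame.index[-1])   # 2013-01-07 00:00:00
```

Seeding only makes the example repeatable; the numbers will still differ from the unseeded output shown above.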
A DataFrame can also be created by passing a dict object:
>>> data = pd.DataFrame({
...     'A': 1,
...     'B': pd.Timestamp('20200708'),
...     'C': pd.Series(1, index=list('ABCD')),
...     'D': np.array([3] * 4),
...     'E': 'foo',
...     'F': pd.Categorical(['test', 'train', 'test', 'train']),
... })
>>> data
Out[19]:
   A          B  C  D    E      F
A  1 2020-07-08  1  3  foo   test
B  1 2020-07-08  1  3  foo  train
C  1 2020-07-08  1  3  foo   test
D  1 2020-07-08  1  3  foo  train
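In the dict above, the scalar values (columns A, B, and E) are repeated for every row, the Series in column C supplies the row index, and the array-like values must already match that length. A minimal annotated sketch that checks this; the variable name `frame` is a hypothetical choice:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({
    'A': 1,                                   # scalar, repeated for every row
    'B': pd.Timestamp('20200708'),            # scalar timestamp, repeated
    'C': pd.Series(1, index=list('ABCD')),    # supplies the row index A..D
    'D': np.array([3] * 4),                   # length must match the index
    'E': 'foo',                               # scalar string, repeated
    'F': pd.Categorical(['test', 'train', 'test', 'train']),
})

print(frame.shape)            # (4, 6)
print(list(frame.index))      # ['A', 'B', 'C', 'D']
print(frame['A'].tolist())    # [1, 1, 1, 1]
```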
View the data type of each column:
>>> data.dtypes
A             int64
B    datetime64[ns]
C             int64
D             int32
E            object
F          category
dtype: object
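The `int32` reported for column D comes from NumPy's platform default integer (for example, Windows with NumPy versions before 2.0); on other setups the same code typically reports `int64`. The sketch below, using a hypothetical `frame`, pins the dtype explicitly and then filters columns by type with `select_dtypes`:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({
    'C': pd.Series(1, index=list('ABCD')),
    'D': np.array([3] * 4, dtype='int64'),   # explicit dtype, independent of the platform default
    'E': 'foo',
})

print(frame.dtypes)

# Keep only the numeric columns.
numeric = frame.select_dtypes(include='number')
print(list(numeric.columns))   # ['C', 'D']
```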
>>> data
Out[23]:
   A          B  C  D    E      F
A  1 2020-07-08  1  3  foo   test
B  1 2020-07-08  1  3  foo  train
C  1 2020-07-08  1  3  foo   test
D  1 2020-07-08  1  3  foo  train
>>> data.head()
Out[24]:
   A          B  C  D    E      F
A  1 2020-07-08  1  3  foo   test
B  1 2020-07-08  1  3  foo  train
C  1 2020-07-08  1  3  foo   test
D  1 2020-07-08  1  3  foo  train
>>> data.tail(3)
Out[25]:
   A          B  C  D    E      F
B  1 2020-07-08  1  3  foo  train
C  1 2020-07-08  1  3  foo   test
D  1 2020-07-08  1  3  foo  train
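`head()` and `tail()` both default to `n=5`, which is why `head()` returned all four rows above. A minimal sketch with a hypothetical four-row `frame`:

```python
import pandas as pd

frame = pd.DataFrame({'value': range(4)}, index=list('ABCD'))

print(len(frame.head()))            # 4, because fewer than the default 5 rows exist
print(list(frame.head(2).index))    # ['A', 'B']
print(list(frame.tail(3).index))    # ['B', 'C', 'D']
```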