Pandas 练习

首先导入库及数据

import pandas  as pd
import numpy as np
df = pd.read_csv('data/2002年-2018年上海机动车拍照拍卖.csv')
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 203 entries, 0 to 202
Data columns (total 5 columns):
 #   Column                          Non-Null Count  Dtype 
---  ------                          --------------  ----- 
 0   Date                            203 non-null    object
 1   Total number of license issued  203 non-null    int64 
 2   lowest price                    203 non-null    int64 
 3   avg price                       203 non-null    int64 
 4   Total number of applicants      203 non-null    int64 
dtypes: int64(4), object(1)
memory usage: 8.1+ KB
df.head()
DateTotal number of license issuedlowest priceavg priceTotal number of applicants
02-Jan140013600147353718
12-Feb180013100140574590
22-Mar200014300146625190
32-Apr230016000163344806
42-May235017800183574665

(1) 哪一次拍卖的中标率首次小于 5%?

df['Total number of license issued']/df['Total number of applicants']
0      0.376547
1      0.392157
2      0.385356
3      0.478568
4      0.503751
         ...   
198    0.053965
199    0.067209
200    0.058990
201    0.066342
202    0.077671
Length: 203, dtype: float64
df[df['Total number of license issued']/df['Total number of applicants'] < 0.05].head(1)
DateTotal number of license issuedlowest priceavg priceTotal number of applicants
15915-May74827900079099156007

(2) 将第一列时间列拆分成两个列,一列为年份(格式为 20××),另一列为 月份(英语缩写)

years = []
months =[]
for i in list(df['Date'].values):
    years.append(i.split('-')[0])
    months.append(i.split('-')[1])
print(years)
['2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '2', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '4', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '5', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '6', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '7', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '8', '9', '9', '9', '9', '9', '9', '9', '9', '9', '9', '9', '9', '10', '10', '10', '10', '10', '10', '10', '10', '10', '10', '10', '10', '11', '11', '11', '11', '11', '11', '11', '11', '11', '11', '11', '11', '12', '12', '12', '12', '12', '12', '12', '12', '12', '12', '12', '12', '13', '13', '13', '13', '13', '13', '13', '13', '13', '13', '13', '13', '14', '14', '14', '14', '14', '14', '14', '14', '14', '14', '14', '14', '15', '15', '15', '15', '15', '15', '15', '15', '15', '15', '15', '15', '16', '16', '16', '16', '16', '16', '16', '16', '16', '16', '16', '16', '17', '17', '17', '17', '17', '17', '17', '17', '17', '17', '17', '17', '18', '18', '18', '18', '18', '18', '18', '18', '18', '18', '18', '18']
df['Years'] = years
df['months'] = months
df['Years']= df['Years'].apply(lambda x: '200' + str(x) if len(str(x)) == 1 else '20'+str(x) )
df
DateTotal number of license issuedlowest priceavg priceTotal number of applicantsYearsmonths
02-Jan1400136001473537182002Jan
12-Feb1800131001405745902002Feb
22-Mar2000143001466251902002Mar
32-Apr2300160001633448062002Apr
42-May2350178001835746652002May
........................
19818-Aug1040288300883651927552018Aug
19918-Sep1271287300874101891422018Sep
20018-Oct1072888000880701818612018Oct
20118-Nov1176687300873741773552018Nov
20218-Dec1285087400875081654422018Dec

203 rows × 7 columns

(3) 按年统计拍卖最低价的下列统计量:最大值、均值、0.75 分位数

df.groupby('Years').describe(percentiles=[.75])
Total number of license issuedlowest price...avg priceTotal number of applicants
countmeanstdmin50%75%maxcountmeanstd...50%75%maxcountmeanstdmin50%75%max
Years
200212.02654.166667674.0172421400.02900.03200.003600.012.020316.6666675825.466557...20541.024790.0031721.012.04373.750000503.5719913525.04546.04662.005190.0
200312.04422.3333331237.6684343000.04500.05156.506650.012.031983.3333335926.033985...35874.038107.7539369.012.010927.0833332253.2748888532.010170.011954.2515507.0
200412.05966.666667787.8788944800.06380.06600.006800.012.029408.33333310235.007513...30157.539650.2545492.012.011048.5000003482.1098758114.09734.511591.5019233.0
200512.05589.833333937.0488333800.05700.06081.506829.012.031908.3333334766.828245...35172.536900.5038378.012.09240.7500001940.3771486208.08863.09997.7513633.0
200612.05375.000000950.7174323800.05250.06275.006500.012.037058.3333333607.935278...38696.040089.2541601.012.017334.41666729305.1344795907.08935.09917.00110234.0
200712.06291.6666671558.8214273500.06000.07500.008500.012.045691.6666674694.766590...46739.049973.2556042.012.09965.2500002104.4121675056.010439.510625.7512943.0
200811.07681.8181823176.4187954500.06800.08600.0016000.011.029945.4545458329.869583...33224.035497.0037659.011.022616.63636415685.25639110170.016801.023774.5063534.0
200912.07216.6666671185.3908705200.08000.08000.008500.012.031333.3333333585.408138...31442.534630.7537593.012.018073.8333332107.16525814906.017543.518620.2522006.0
201012.08600.000000532.5752197500.08750.09000.009200.012.038008.3333338947.062325...40274.542200.5045291.012.015867.8333332336.78920511224.016288.017410.7518975.0
201112.08625.000000607.7155447500.09000.09000.009500.012.047958.3333334245.095209...48277.551634.7554008.012.023580.0833333153.80351819415.022400.025255.0030675.0
201212.09025.000000679.7392558000.09500.09500.009500.012.061108.3333335091.786315...62092.566495.7569346.012.022391.8333332747.91312018244.023048.524459.0026526.0
201312.09166.666667685.3444178500.09000.09000.0011000.012.079125.0000005149.426270...77144.083609.0091898.012.027110.3333336808.14750320857.024120.030453.7539625.0
201412.07603.916667362.2185327400.07400.07610.258300.012.073816.666667392.737093...73873.574084.5074680.012.097630.16666732370.02923341946.0101252.0121717.25136098.0
201512.07745.583333403.4549567406.07592.07819.758727.012.080575.0000003873.951471...81465.583521.2585424.012.0152743.33333327089.89240398203.0166033.5169618.00179133.0
201612.011033.2500001490.0687678310.011549.011673.0012889.012.085733.3333332216.604009...85825.087516.0088665.012.0232111.33333328913.748423187533.0225326.5252615.25277889.0
201712.011115.416667900.86266310157.010780.012159.2512413.012.090616.6666672099.278375...90812.092399.5093540.012.0249034.33333314597.264995226911.0251141.5257564.75270197.0
201812.011241.3333331011.9004029855.010936.511982.7512850.012.087825.000000570.685393...87918.088223.2589018.012.0198864.50000018424.422554165442.0200482.0211518.00226316.0

17 rows × 28 columns

(4)现在将表格行索引设为多级索引,外层为年份,内层为原表格第二至第 五列的变量名,列索引为月份。

L1 = ['Years']
L2 = ['Total number of license issued', 'lowest price ', 'avg price', 'Total number of applicants']
mul_index = pd.MultiIndex.from_product([L1,L2],names=('Upper', 'Lower'))
pd.pivot_table(df,index=['Years'],
               columns=['months'])
Total number of applicants...lowest price
monthsAprAugDecFebJanJulJunMarMayNov...DecFebJanJulJunMarMayNovOctSep
Years
20024806.04640.03525.04590.03718.03774.04502.05190.04665.04021.0...27800.013100.013600.019800.019600.014300.017800.030800.026400.023600.0
20038794.09315.010491.012030.09442.011929.015507.011219.014634.09849.0...37100.023800.018800.036900.036100.028800.035000.033100.032800.028800.0
20048150.015506.09005.010156.08663.014464.019233.09950.08114.09188.0...29300.039600.038000.021800.017800.043000.010800.026000.028000.029300.0
20058113.07520.08351.08949.06208.08777.08409.09117.09673.013633.0...35200.031700.028500.037900.037000.034300.035000.029800.025200.026500.0
20067888.09190.09477.012367.05907.08966.08478.08904.08301.0110234.0...39800.034200.026900.039600.039500.038500.037700.037800.036300.037000.0
200710523.012943.010356.05056.06587.010327.011478.010168.010273.010596.0...50000.039100.038500.045200.047200.041100.044500.053800.050500.048600.0
200837072.013451.016801.0NaN20539.016783.021208.063534.026341.010170.0...31000.0NaN8100.033800.033900.031300.034400.021800.032600.029300.0
200917654.018750.018577.016848.016544.017220.017433.018575.016471.021902.0...36900.033000.028600.032100.030000.026600.028500.034900.033900.027200.0
201017313.016855.011224.018810.018975.013389.016252.017704.016324.013429.0...10400.038300.037800.038400.039200.039600.041900.044900.043000.041800.0
201122326.021544.026531.025104.030675.021852.022474.025014.025708.020050.0...51000.044200.038300.050900.048500.046200.047400.045700.053800.052200.0
201222706.021425.018244.023391.024354.026526.024774.024897.024230.019120.0...68900.055400.052800.057700.055800.058300.064000.066400.065200.065700.0
201326174.022650.039625.024651.020857.021811.021482.023589.022224.038220.0...76000.083300.075000.076300.077600.090800.080700.075500.082300.073400.0
201494241.0121550.096972.045758.041946.0136098.0135677.061853.0114121.095595.0...73600.073200.073500.074600.073800.073800.074400.073500.074000.073800.0
2015152298.0166939.0179133.0103224.098203.0166302.0172205.0132690.0156007.0169159.0...84500.076500.074000.083100.080000.074600.079000.084600.085300.082100.0
2016256897.0251188.0219882.0196470.0187533.0240750.0275438.0221109.0277889.0215424.0...88300.083200.082200.087200.084400.083100.085000.088600.088300.086500.0
2017252273.0256083.0228148.0251717.0232101.0269189.0244349.0262010.0270197.0226911.0...92800.088200.087600.092200.089400.087800.090100.093100.093500.091300.0
2018204980.0192755.0165442.0220831.0226316.0202337.0209672.0217056.0198627.0177355.0...87400.087600.087900.088300.087800.088100.089000.087300.088000.087300.0

17 rows × 48 columns


  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值