关于空难数据集的探索分析

写在前面:
这是我见过的最严肃的数据集,几乎每一行数据背后都是生命和鲜血的代价。这次探索分析并不妄图说明什么,仅仅是对数据处理能力的锻炼。因此本次的探索分析只会展示数据该有的样子而不会进行太多的评价。有一句话叫“因为珍爱和平,我们回首战争”。这里也是,因为珍爱生命,所以回首空难。现在安全的飞行是10万多无辜的人通过性命换来的,向这些伟大的探索者致敬。

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

导入数据集

crash = pd.read_csv("./Airplane_Crashes_and_Fatalities_Since_1908.csv")
crash.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5268 entries, 0 to 5267
Data columns (total 13 columns):
Date            5268 non-null object
Time            3049 non-null object
Location        5248 non-null object
Operator        5250 non-null object
Flight #        1069 non-null object
Route           3562 non-null object
Type            5241 non-null object
Registration    4933 non-null object
cn/In           4040 non-null object
Aboard          5246 non-null float64
Fatalities      5256 non-null float64
Ground          5246 non-null float64
Summary         4878 non-null object
dtypes: float64(3), object(10)
memory usage: 535.1+ KB
crash = crash.drop(["Summary","cn/In","Flight #","Route","Location"],axis=1)
crash.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5268 entries, 0 to 5267
Data columns (total 8 columns):
Date            5268 non-null object
Time            3049 non-null object
Operator        5250 non-null object
Type            5241 non-null object
Registration    4933 non-null object
Aboard          5246 non-null float64
Fatalities      5256 non-null float64
Ground          5246 non-null float64
dtypes: float64(3), object(5)
memory usage: 329.3+ KB
print(crash[2200:2205])
            Date   Time                      Operator              Type  \
2200  03/06/1968   8:00     Military - U.S. Air Force  Fairchild C-123K   
2201  03/08/1968  19:18                    Air Manila    Fairchild F-27   
2202  03/09/1968  23:20   Military - French Air Force      Douglas DC6B   
2203  03/19/1968  19:37     Viking Airways - Air Taxi        Cessna 182   
2204  03/23/1968  13:00  Fortaire Aviation - Air Taxi       Brantly 305   

     Registration  Aboard  Fatalities  Ground  
2200      54-0590    49.0        49.0     0.0  
2201      PI-C871    14.0        14.0     0.0  
2202        43748    20.0        19.0     0.0  
2203       N2623F     2.0         2.0     0.0  
2204       N2224U     5.0         3.0     0.0  

伤亡分析

伤亡排序

print(crash["Fatalities"].sum())
fatal_crash = crash[crash["Fatalities"].notnull()]
fatal_crash = fatal_crash.sort_values(by="Fataliti
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值