Ensemble Learning: XGBoost

日萌社

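5.1 How the XGBoost algorithm works

XGBoost (Extreme Gradient Boosting) is one of the strongest ensemble learning methods; a large share of winning solutions in Kaggle data mining competitions have used it.

XGBoost performs very well on most regression and classification problems. This section describes how the algorithm works in some detail.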

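1 How to build an optimal model

As covered earlier, the general way to build an optimal model is to minimize the loss function on the training data.

We denote the loss by L, as in the following formula:

where F is the hypothesis space.

The hypothesis space is the exhaustive set of all candidate hypotheses that could satisfy the target, given the known attributes and their possible values.

Formula (1.1) is called empirical risk minimization. A model trained this way tends to have high complexity, and when the training set is small it overfits easily.

Therefore, to reduce model complexity, the following formula is commonly used:

where J(f) is the complexity of the model.

Formula (2.1) is called structural risk minimization. Models obtained this way usually predict well on both the training data and unseen test data.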
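In standard textbook notation, the two formulas referred to as (1.1) and (2.1) can be written as below. This is a sketch under assumptions: the N training pairs (x_i, y_i) and the trade-off coefficient \lambda \ge 0 are notation introduced here for illustration.

```latex
% (1.1) empirical risk minimization over the hypothesis space F
\min_{f \in F} \; \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i, f(x_i)\bigr) \tag{1.1}

% (2.1) structural risk minimization: empirical risk plus a complexity penalty J(f)
\min_{f \in F} \; \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i, f(x_i)\bigr) + \lambda J(f) \tag{2.1}
```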

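Applications:

  • Growing a decision tree corresponds to empirical risk minimization, and pruning it corresponds to structural risk minimization.
  • In XGBoost, growing the trees is itself the result of structural risk minimization, as described in detail later.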

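2 Deriving the XGBoost objective function

2.1 Defining the objective function

The objective function is the loss function; the optimal model is built by minimizing it.

As shown above, the loss function should include a regularization term that represents model complexity. Because the XGBoost model consists of multiple CART trees, the objective function of the model is: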
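A common way to write this objective, following the form used in the XGBoost documentation, is sketched below. The symbols n (number of samples), K (number of trees), f_k (the k-th tree) and \Omega (the per-tree complexity) are assumed notation.

```latex
% Training loss over all samples plus the complexity of every tree
Obj = \sum_{i=1}^{n} L\bigl(y_i, \hat{y}_i\bigr) + \sum_{k=1}^{K} \Omega(f_k),
\qquad \hat{y}_i = \sum_{k=1}^{K} f_k(x_i)
```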

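2.2 Introduction to CART trees

2.3 Defining tree complexity

2.3.1 Defining the complexity of each tree

The XGBoost model consists of multiple CART trees. The complexity of each tree is defined as: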
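The usual XGBoost definition penalizes the number of leaves T and the squared leaf weights: Omega(f) = gamma * T + 0.5 * lambda * sum(w_j^2). The sketch below computes it for one tree; the function name `tree_complexity` and the leaf weights are hypothetical values chosen for illustration.

```python
def tree_complexity(leaf_weights, gamma, lam):
    """Omega(f) = gamma * T + 0.5 * lam * sum(w_j ** 2), with T = number of leaves."""
    num_leaves = len(leaf_weights)
    squared_weight_sum = sum(w * w for w in leaf_weights)
    return gamma * num_leaves + 0.5 * lam * squared_weight_sum


# Hypothetical tree with three leaves
leaf_weights = [2.0, 0.1, -1.0]
omega = tree_complexity(leaf_weights, gamma=1.0, lam=1.0)
print(omega)  # 3 leaves * 1.0 + 0.5 * (4 + 0.01 + 1), about 5.505
```

Raising gamma makes every extra leaf more expensive, while raising lam mainly penalizes large leaf scores.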

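2.3.2 An example of tree complexity

Suppose we want to predict how much each member of a family likes video games. Younger people are more likely to enjoy video games than older people, and males more likely than females. So we first split on age to separate children from adults, then split on sex, and assign each person a score for how much they like video games, as shown in the figure below:

In this way two trees, tree1 and tree2, are trained. As with GBDT earlier, the final result is the sum of the outputs of the two trees, so:

  • The boy's predicted score is the sum of the scores of the leaves he falls into in the two trees: 2 + 0.9 = 2.9.
  • The grandfather's predicted score is computed the same way: -1 + (-0.9) = -1.9.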
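The additive prediction above can be sketched in a few lines. Each tree is reduced here to a lookup from person to leaf score, which is a simplification: a real tree would route a sample through its split rules. The dictionary names are hypothetical.

```python
# Leaf score each person receives in each tree (values from the example above)
tree1_scores = {"boy": 2.0, "grandpa": -1.0}
tree2_scores = {"boy": 0.9, "grandpa": -0.9}
trees = [tree1_scores, tree2_scores]


def predict(person, trees):
    """Final prediction = sum of the leaf scores over all trees."""
    return sum(tree[person] for tree in trees)


boy_score = predict("boy", trees)          # 2 + 0.9, about 2.9
grandpa_score = predict("grandpa", trees)  # -1 + (-0.9), about -1.9
print(round(boy_score, 2), round(grandpa_score, 2))
```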

See the figure below for details:

2.4 Deriving the objective function
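A standard form of this derivation, following the XGBoost documentation, is sketched below. The notation is assumed here: g_i and h_i are the first and second derivatives of the loss with respect to the previous round's prediction, I_j is the set of samples in leaf j, w_j is the weight of leaf j, T is the number of leaves, and \gamma and \lambda are the regularization coefficients. Constant terms are dropped in the last line.

```latex
% Objective at round t, where \hat{y}_i^{(t-1)} is the prediction after t-1 trees
Obj^{(t)} = \sum_{i=1}^{n} L\bigl(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\bigr) + \Omega(f_t) + \text{const}

% Second-order Taylor expansion around \hat{y}_i^{(t-1)}
Obj^{(t)} \approx \sum_{i=1}^{n} \Bigl[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t(x_i)^2 \Bigr]
  + \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^2 + \text{const}

% Group samples by leaf: G_j = \sum_{i \in I_j} g_i, \quad H_j = \sum_{i \in I_j} h_i
Obj^{(t)} \approx \sum_{j=1}^{T} \Bigl[ G_j w_j + \tfrac{1}{2} (H_j + \lambda) w_j^2 \Bigr] + \gamma T + \text{const}

% Minimizing over each w_j gives the optimal leaf weight and the structure score
w_j^{*} = -\frac{G_j}{H_j + \lambda},
\qquad
Obj^{*} = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T
```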

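3 Building regression trees in XGBoost

3.1 Computing node splits

In actual training, when building the t-th tree, XGBoost splits tree nodes greedily:

Starting from a tree of depth 0:

  • Try to split every leaf node in the tree;

  • After each split, the original leaf node becomes two child leaves, left and right, and the samples in the original leaf are distributed to the two children according to the node's decision rule;

  • After splitting a node, we check whether the split yields a gain in the loss function. The gain is defined as follows:

If Gain > 0, meaning the objective function decreases after splitting into two leaf nodes, the split is considered.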
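The gain commonly used by XGBoost compares the structure score of the two children with that of the parent: Gain = 0.5 * [G_L^2/(H_L+lambda) + G_R^2/(H_R+lambda) - (G_L+G_R)^2/(H_L+H_R+lambda)] - gamma. The sketch below assumes this standard form; G and H are the sums of first and second derivatives of the loss over the samples in a node, and all numbers are hypothetical.

```python
def split_gain(g_left, h_left, g_right, h_right, lam, gamma):
    """Gain of splitting one leaf into a left and a right child."""

    def score(g, h):
        # Structure score contribution of a single leaf
        return g * g / (h + lam)

    children = score(g_left, h_left) + score(g_right, h_right)
    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (children - parent) - gamma


# Hypothetical gradient statistics for the two candidate children
gain = split_gain(g_left=-4.0, h_left=3.0, g_right=5.0, h_right=4.0, lam=1.0, gamma=0.5)
print(gain)  # 3.9375, positive, so the split would be considered
```

Because gamma is subtracted at the end, a larger gamma turns more candidate splits into negative-gain splits.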

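So if we keep splitting like this, when do we stop?

3.2 Conditions for stopping splits

Case 1: The scoring function derived in the previous section measures how good a tree structure is, so it can be used to choose the best split point. First enumerate all candidate split points of the sample features, then split at each candidate; the quality of a split is judged by the following criterion: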

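4 Differences between XGBoost and GBDT

  • Difference 1:
    • XGBoost takes tree complexity into account while growing each CART tree.
    • GBDT does not; it only accounts for tree complexity in the pruning step.
  • Difference 2:
    • XGBoost fits a second-order expansion of the previous round's loss function, while GBDT fits a first-order expansion. As a result, XGBoost tends to be more accurate and needs fewer iterations to reach the same training result.
  • Difference 3:
    • Both XGBoost and GBDT improve the model through successive iterations, but XGBoost can use multiple threads when searching for the best split point, which greatly speeds up training.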
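To make the first-order versus second-order distinction concrete, the sketch below computes both derivatives of the logistic loss for one sample. A first-order method uses only the gradient g, while a second-order method also uses the curvature h. The function name is hypothetical and the logistic loss is chosen only as an example.

```python
import math


def logistic_grad_hess(y_true, raw_score):
    """First and second derivative of the logistic loss w.r.t. the raw score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))  # predicted probability
    grad = p - y_true                        # first-order information
    hess = p * (1.0 - p)                     # second-order information
    return grad, hess


g, h = logistic_grad_hess(y_true=1, raw_score=0.0)
newton_step = -g / h  # update that uses curvature (lambda = 0 for simplicity)
print(g, h, newton_step)  # -0.5 0.25 2.0
```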

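5 Summary


5.2 The XGBoost API

1 Installing xgboost:

Official documentation: https://xgboost.readthedocs.io/en/latest/

pip3 install xgboost

2 xgboost parameters

Although xgboost is known as a secret weapon for Kaggle competitions, training a good model still requires passing suitable values for its parameters.

xgboost exposes many parameters, which fall into three groups: general parameters, booster parameters, and task parameters (learning objective).

  • General parameters: control the overall behaviour;
  • Booster parameters: depend on the chosen booster and control the booster (tree or regression) at each step;
  • Task parameters: control the learning objective and how training is evaluated.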
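One way to keep the three groups apart is to build them as separate dictionaries and merge them before training. This is a minimal sketch: the grouping into three dictionaries is only an organizational choice, and the values are illustrative, not recommendations.

```python
# General parameters: which booster to use and how many threads
general_params = {"booster": "gbtree", "nthread": 4}

# Booster parameters: control how each tree is grown
booster_params = {"eta": 0.3, "gamma": 0, "max_depth": 6, "subsample": 1.0}

# Task parameters: what is being learned and how it is evaluated
task_params = {"objective": "binary:logistic", "eval_metric": "logloss", "seed": 0}

# xgboost's native training API takes a single flat dictionary
params = {**general_params, **booster_params, **task_params}
print(params)
```

The merged `params` dictionary is the kind of object that would be passed as the first argument to `xgboost.train`, together with a `DMatrix` holding the training data.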

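2.1 General parameters

  1. booster [default = gbtree]

    • Selects which booster to use: gbtree, gblinear, or dart.
    • gbtree and dart use tree-based models (dart additionally applies dropout), while gblinear uses linear functions.
  2. silent [default = 0]

    • 0 prints run-time messages; 1 enables silent mode and prints nothing. (Newer xgboost releases replace this parameter with verbosity.)
  3. nthread [default = maximum number of threads available]

    • Number of threads used to run xgboost in parallel. The value should be <= the number of CPU cores; if it is not set, the algorithm detects the core count and uses all of them.

The following two parameters are set automatically and can be left at their defaults:

  1. num_pbuffer [set automatically by xgboost; no need to set it]

    • Size of the prediction buffer, usually set to the number of training instances. The buffer stores the prediction results of the last boosting step.
  2. num_feature [set automatically by xgboost; no need to set it]

    • Feature dimension used in boosting, set to the maximum feature dimension.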

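2.2 Booster parameters

2.2.1 Parameters for Tree Booster

  1. eta [default = 0.3, alias: learning_rate]

    • Step-size shrinkage applied at each update to prevent overfitting.

    • After each boosting step the weights of the new features are available directly; eta shrinks these weights, which makes boosting more robust.

    • Range: [0,1]
  2. gamma [default = 0, alias: min_split_loss] (minimum loss reduction for a split)

    • A node is split only if the split reduces the value of the loss function.
    • gamma specifies the minimum loss reduction required to split a node. The larger the value, the more conservative the algorithm. The right value depends closely on the loss function, so it needs tuning.

    • Range: [0,∞]

  3. max_depth [default = 6]

    • Maximum depth of a tree, also used to control overfitting. The larger max_depth is, the more specific and local the patterns the model learns. 0 means no limit.
    • Range: [0,∞]
  4. min_child_weight [default = 1]

    • Minimum sum of sample weights required in a leaf node.
    • Larger values keep the model from learning patterns specific to a few local samples, but a value that is too high leads to underfitting. Tune this parameter with cross-validation (CV).
    • Range: [0,∞]
  5. subsample [default = 1]

    • Fraction of the training samples randomly drawn for each tree.
    • Lowering this value makes the algorithm more conservative and helps avoid overfitting, but a value that is too small can cause underfitting.

    • Typical values: 0.5-1. A value of 0.5 means half of the training data is sampled for each tree, which helps prevent overfitting.

    • Range: (0,1]
  6. colsample_bytree [default = 1]

    • Fraction of columns randomly sampled for each tree (each column is a feature).
    • Typical values: 0.5-1
    • Range: (0,1]
  7. colsample_bylevel [default = 1]

    • Fraction of columns sampled for each split at each level of the tree.
    • I personally rarely use this parameter, because subsample and colsample_bytree can achieve the same effect. If you are interested, though, it may be worth exploring further.
    • Range: (0,1]
  8. lambda [default = 1, alias: reg_lambda]

    • L2 regularization term on the weights (similar to Ridge regression).
    • This parameter controls the regularization part of XGBoost. Although many data scientists rarely use it, it can still be useful for reducing overfitting.
  9. alpha [default = 0, alias: reg_alpha]

    • L1 regularization term on the weights (similar to Lasso regression). It can be applied in very high-dimensional settings to make the algorithm faster.
  10. scale_pos_weight [default = 1]

    • When the classes are highly imbalanced, setting this parameter to a positive value can make the algorithm converge faster. It is usually set to the ratio of the number of negative samples to the number of positive samples.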
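The ratio suggested for scale_pos_weight can be computed directly from the label counts. The sketch below uses a small made-up label list; with real data the labels would come from the training target.

```python
from collections import Counter

# Hypothetical imbalanced binary labels: 8 negatives, 2 positives
labels = [0] * 8 + [1] * 2

counts = Counter(labels)
# Ratio of negative samples to positive samples
scale_pos_weight = counts[0] / counts[1]
print(scale_pos_weight)  # 4.0
```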

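2.2.2 Parameters for Linear Booster

The linear booster is rarely used.

  1. lambda [default = 0, alias: reg_lambda]

    • L2 regularization coefficient; increasing it makes the model more conservative.
  2. alpha [default = 0, alias: reg_alpha]

    • L1 regularization coefficient; increasing it makes the model more conservative.
  3. lambda_bias [default = 0, alias: reg_lambda_bias]

    • L2 regularization on the bias (there is no L1 term on the bias because it is not important).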

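2.3 Task parameters (learning objective)

  1. objective [default = reg:linear]

    1. "reg:linear": linear regression (newer xgboost releases name this objective reg:squarederror)
    2. "reg:logistic": logistic regression
    3. "binary:logistic": logistic regression for binary classification; outputs probabilities
    4. "multi:softmax": multi-class classification using softmax; returns the predicted class (not probabilities). In this case you must also set num_class (the number of classes)
    5. "multi:softprob": same as multi:softmax, but returns the probability of each sample belonging to each class.
  2. eval_metric [default = chosen according to the objective]

    Available options include:

    1. "rmse": root mean squared error
    2. "mae": mean absolute error
    3. "logloss": negative log-likelihood
    4. "error": binary classification error rate.
      • Computed as the number of misclassified samples divided by the total number of samples. Predictions greater than 0.5 are treated as the positive class, the rest as the negative class.
    5. "error@t": like error, but the decision threshold can be set through t
    6. "merror": multi-class classification error rate, computed as (wrong cases)/(all cases)
    7. "mlogloss": multi-class log loss
    8. "auc": area under the curve
  3. seed [default = 0]

    • Random number seed
    • Setting it makes results that depend on randomness reproducible; it can also be used when tuning parameters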
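A few of the metrics listed above are simple enough to write out by hand. The sketch below is a plain-Python illustration of "rmse", "logloss" and "error@t" as described in the list. It is not xgboost's implementation, and the function names and sample values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def logloss(y_true, y_prob):
    """Negative log-likelihood of binary labels, averaged over samples."""
    total = sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, y_prob))
    return -total / len(y_true)


def error_at(y_true, y_prob, t=0.5):
    """Binary error rate; probabilities greater than t count as the positive class."""
    wrong = sum(int(p > t) != y for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.4, 0.3, 0.6]
print(error_at(y_true, y_prob))         # 0.5  (threshold 0.5: two samples misclassified)
print(error_at(y_true, y_prob, t=0.7))  # 0.25 (threshold 0.7: one sample misclassified)
```

Moving the threshold changes which predictions count as positive, which is why "error" and "error@t" can disagree on the same probabilities.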

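5.3 An XGBoost case study

1 Background

This case uses the same data as the earlier decision tree case.

The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 of the 2224 passengers and crew. The tragedy shocked the international community and led to better safety regulations for ships. One reason the shipwreck cost so many lives was that there were not enough lifeboats for the passengers and crew. Although surviving the sinking involved some luck, some groups were more likely to survive than others, such as women, children, and the upper class. In this case, we ask you to analyse which people were likely to survive. In particular, we ask you to apply machine learning tools to predict which passengers survived the tragedy.

Case: https://www.kaggle.com/c/titanic/overview

The features in the extracted dataset include ticket class, survival, passenger class, age, port of embarkation, home.dest, room, boat, and sex, among others.

Data: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic.txt

From inspecting the data:

  • 1 pclass is the passenger class (1, 2, 3), a proxy for socio-economic status.
  • 2 The age column has missing values.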

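2 Steps

  • 1. Load the data
  • 2. Basic data processing
    • 2.1 Select the features and the target
    • 2.2 Handle missing values
    • 2.3 Split the dataset
  • 3. Feature engineering (dict feature extraction)
  • 4. Machine learning (xgboost)
  • 5. Model evaluation

3 Implementation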

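  • Import the required modules
import pandas as pd
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.model_selection import train_test_split
  • 1. Load the data
# 1. Load the data
titan = pd.read_csv("http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic.txt")
  • 2. Basic data processing

    • 2.1 Select the features and the target
    # .copy() gives an independent frame, so filling values below does not act on a slice of titan
    x = titan[["pclass", "age", "sex"]].copy()
    y = titan["survived"]

    • 2.2 Handle missing values
    # Fill missing ages with the mean age; categorical features are handled by dict feature extraction below
    x["age"] = x["age"].fillna(x["age"].mean())

    • 2.3 Split the dataset
    x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=22)

  • 3. Feature engineering (dict feature extraction)

The features contain categorical values, which need one-hot encoding (DictVectorizer).

x.to_dict(orient="records") converts the DataFrame rows into a list of dicts.

# Convert x to dict records with x.to_dict(orient="records")
# [{"pclass": "1st", "age": 29.00, "sex": "female"}, {}]

transfer = DictVectorizer(sparse=False)

# Fit the vectorizer on the training set only, then reuse the fitted mapping on the test set
x_train = transfer.fit_transform(x_train.to_dict(orient="records"))
x_test = transfer.transform(x_test.to_dict(orient="records"))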
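The idea behind dict feature extraction can be sketched in plain Python. This is not scikit-learn's implementation: `one_hot_records` is a hypothetical helper that only illustrates how string values become `name=value` indicator columns, and why the column layout learned from the training records should be reused for the test records.

```python
def one_hot_records(records, feature_names=None):
    """Turn a list of dicts into numeric rows; strings become one-hot columns."""
    if feature_names is None:
        # Learn the column layout from the given records
        names = set()
        for rec in records:
            for key, value in rec.items():
                names.add(f"{key}={value}" if isinstance(value, str) else key)
        feature_names = sorted(names)
    index = {name: i for i, name in enumerate(feature_names)}

    rows = []
    for rec in records:
        row = [0.0] * len(feature_names)
        for key, value in rec.items():
            if isinstance(value, str):
                name, cell = f"{key}={value}", 1.0
            else:
                name, cell = key, float(value)
            if name in index:  # categories not seen while fitting are ignored
                row[index[name]] = cell
        rows.append(row)
    return feature_names, rows


train = [{"pclass": "1st", "age": 29.0, "sex": "female"},
         {"pclass": "3rd", "age": 45.0, "sex": "male"}]
test = [{"pclass": "1st", "age": 2.0, "sex": "male"}]

names, train_rows = one_hot_records(train)
_, test_rows = one_hot_records(test, feature_names=names)  # reuse the training layout
print(names)      # ['age', 'pclass=1st', 'pclass=3rd', 'sex=female', 'sex=male']
print(test_rows)  # [[2.0, 1.0, 0.0, 0.0, 1.0]]
```

If the layout were learned again from the test records alone, the columns could differ from the ones the model was trained on.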

  • 4. Train and evaluate the xgboost model
# Initial model training
from xgboost import XGBClassifier
xg = XGBClassifier()

xg.fit(x_train, y_train)

xg.score(x_test, y_test)
# Tune max_depth
# Start at 1: max_depth=0 means "no limit" and the exact tree method requires a non-zero value
depth_range = range(1, 10)
score = []
for i in depth_range:
    xg = XGBClassifier(eta=1, gamma=0, max_depth=i)
    xg.fit(x_train, y_train)
    s = xg.score(x_test, y_test)
    print(s)
    score.append(s)
# Visualize the results
import matplotlib.pyplot as plt

plt.plot(depth_range, score)

plt.show()
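Besides reading the best depth off the plot, it can be picked programmatically from the collected scores. The sketch below uses made-up accuracy values purely for illustration; in practice `score` would be the list filled by the tuning loop.

```python
depth_range = range(1, 10)
# Hypothetical accuracies, one per depth in depth_range (not real results)
score = [0.78, 0.80, 0.81, 0.79, 0.79, 0.78, 0.78, 0.77, 0.77]

# Index of the highest score; ties resolve to the smallest depth
best_index = max(range(len(score)), key=score.__getitem__)
best_depth = depth_range[best_index]
print(best_depth, score[best_index])  # 3 0.81
```

Selecting the depth on the test set, as the tuning loop does, gives an optimistic estimate; a separate validation split or cross-validation is the more careful choice.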


In [1]:

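# 1. Load the data
# 2. Basic data processing
# 2.1 Select the features and the target
# 2.2 Handle missing values
# 2.3 Split the dataset
# 3. Feature engineering (dict feature extraction)
# 4. Machine learning (xgboost)
# 5. Model evaluation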

In [2]:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier, export_graphviz

In [3]:

# 1. Load the data
titan = pd.read_csv("http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic.txt")

In [4]:

titan

Out[4]:

     | row.names | pclass | survived | name | age | embarked | home.dest | room | ticket | boat | sex
0 | 1 | 1st | 1 | Allen, Miss Elisabeth Walton | 29.0000 | Southampton | St Louis, MO | B-5 | 24160 L221 | 2 | female
1 | 2 | 1st | 0 | Allison, Miss Helen Loraine | 2.0000 | Southampton | Montreal, PQ / Chesterville, ON | C26 | NaN | NaN | female
2 | 3 | 1st | 0 | Allison, Mr Hudson Joshua Creighton | 30.0000 | Southampton | Montreal, PQ / Chesterville, ON | C26 | NaN | (135) | male
3 | 4 | 1st | 0 | Allison, Mrs Hudson J.C. (Bessie Waldo Daniels) | 25.0000 | Southampton | Montreal, PQ / Chesterville, ON | C26 | NaN | NaN | female
4 | 5 | 1st | 1 | Allison, Master Hudson Trevor | 0.9167 | Southampton | Montreal, PQ / Chesterville, ON | C22 | NaN | 11 | male
5 | 6 | 1st | 1 | Anderson, Mr Harry | 47.0000 | Southampton | New York, NY | E-12 | NaN | 3 | male
6 | 7 | 1st | 1 | Andrews, Miss Kornelia Theodosia | 63.0000 | Southampton | Hudson, NY | D-7 | 13502 L77 | 10 | female
7 | 8 | 1st | 0 | Andrews, Mr Thomas, jr | 39.0000 | Southampton | Belfast, NI | A-36 | NaN | NaN | male
8 | 9 | 1st | 1 | Appleton, Mrs Edward Dale (Charlotte Lamson) | 58.0000 | Southampton | Bayside, Queens, NY | C-101 | NaN | 2 | female
9 | 10 | 1st | 0 | Artagaveytia, Mr Ramon | 71.0000 | Cherbourg | Montevideo, Uruguay | NaN | NaN | (22) | male
10 | 11 | 1st | 0 | Astor, Colonel John Jacob | 47.0000 | Cherbourg | New York, NY | NaN | 17754 L224 10s 6d | (124) | male
11 | 12 | 1st | 1 | Astor, Mrs John Jacob (Madeleine Talmadge Force) | 19.0000 | Cherbourg | New York, NY | NaN | 17754 L224 10s 6d | 4 | female
12 | 13 | 1st | 1 | Aubert, Mrs Leontine Pauline | NaN | Cherbourg | Paris, France | B-35 | 17477 L69 6s | 9 | female
13 | 14 | 1st | 1 | Barkworth, Mr Algernon H. | NaN | Southampton | Hessle, Yorks | A-23 | NaN | B | male
14 | 15 | 1st | 0 | Baumann, Mr John D. | NaN | Southampton | New York, NY | NaN | NaN | NaN | male
15 | 16 | 1st | 1 | Baxter, Mrs James (Helene DeLaudeniere Chaput) | 50.0000 | Cherbourg | Montreal, PQ | B-58/60 | NaN | 6 | female
16 | 17 | 1st | 0 | Baxter, Mr Quigg Edmond | 24.0000 | Cherbourg | Montreal, PQ | B-58/60 | NaN | NaN | male
17 | 18 | 1st | 0 | Beattie, Mr Thomson | 36.0000 | Cherbourg | Winnipeg, MN | C-6 | NaN | NaN | male
18 | 19 | 1st | 1 | Beckwith, Mr Richard Leonard | 37.0000 | Southampton | New York, NY | D-35 | NaN | 5 | male
19 | 20 | 1st | 1 | Beckwith, Mrs Richard Leonard (Sallie Monypeny) | 47.0000 | Southampton | New York, NY | D-35 | NaN | 5 | female
20 | 21 | 1st | 1 | Behr, Mr Karl Howell | 26.0000 | Cherbourg | New York, NY | C-148 | NaN | 5 | male
21 | 22 | 1st | 0 | Birnbaum, Mr Jakob | 25.0000 | Cherbourg | San Francisco, CA | NaN | NaN | (148) | male
22 | 23 | 1st | 1 | Bishop, Mr Dickinson H. | 25.0000 | Cherbourg | Dowagiac, MI | B-49 | NaN | 7 | male
23 | 24 | 1st | 1 | Bishop, Mrs Dickinson H. (Helen Walton) | 19.0000 | Cherbourg | Dowagiac, MI | B-49 | NaN | 7 | female
24 | 25 | 1st | 1 | Bjornstrm-Steffansson, Mr Mauritz Hakan | 28.0000 | Southampton | Stockholm, Sweden / Washington, DC | NaN | D | male
25 | 26 | 1st | 0 | Blackwell, Mr Stephen Weart | 45.0000 | Southampton | Trenton, NJ | NaN | NaN | (241) | male
26 | 27 | 1st | 1 | Blank, Mr Henry | 39.0000 | Cherbourg | Glen Ridge, NJ | A-31 | NaN | 7 | male
27 | 28 | 1st | 1 | Bonnell, Miss Caroline | 30.0000 | Southampton | Youngstown, OH | C-7 | NaN | 8 | female
28 | 29 | 1st | 1 | Bonnell, Miss Elizabeth | 58.0000 | Southampton | Birkdale, England Cleveland, Ohio | C-103 | NaN | 8 | female
29 | 30 | 1st | 0 | Borebank, Mr John James | NaN | Southampton | London / Winnipeg, MB | D-21/2 | NaN | NaN | male
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
1283 | 1284 | 3rd | 0 | Vestrom, Miss Hulda Amanda Adolfina | NaN | NaN | NaN | NaN | NaN | NaN | female
1284 | 1285 | 3rd | 0 | Vonk, Mr Jenko | NaN | NaN | NaN | NaN | NaN | NaN | male
1285 | 1286 | 3rd | 0 | Ware, Mr Frederick | NaN | NaN | NaN | NaN | NaN | NaN | male
1286 | 1287 | 3rd | 0 | Warren, Mr Charles William | NaN | NaN | NaN | NaN | NaN | NaN | male
1287 | 1288 | 3rd | 0 | Wazli, Mr Yousif | NaN | NaN | NaN | NaN | NaN | NaN | male
1288 | 1289 | 3rd | 0 | Webber, Mr James | NaN | NaN | NaN | NaN | NaN | NaN | male
1289 | 1290 | 3rd | 1 | Wennerstrom, Mr August Edvard | NaN | NaN | NaN | NaN | NaN | NaN | male
1290 | 1291 | 3rd | 0 | Wenzel, Mr Linhart | NaN | NaN | NaN | NaN | NaN | NaN | male
1291 | 1292 | 3rd | 0 | Widegren, Mr Charles Peter | NaN | NaN | NaN | NaN | NaN | NaN | male
1292 | 1293 | 3rd | 0 | Wiklund, Mr Jacob Alfred | NaN | NaN | NaN | NaN | NaN | NaN | male
1293 | 1294 | 3rd | 1 | Wilkes, Mrs Ellen | NaN | NaN | NaN | NaN | NaN | NaN | female
1294 | 1295 | 3rd | 0 | Willer, Mr Aaron | NaN | NaN | NaN | NaN | NaN | NaN | male
1295 | 1296 | 3rd | 0 | Willey, Mr Edward | NaN | NaN | NaN | NaN | NaN | NaN | male
1296 | 1297 | 3rd | 0 | Williams, Mr Howard Hugh | NaN | NaN | NaN | NaN | NaN | NaN | male
1297 | 1298 | 3rd | 0 | Williams, Mr Leslie | NaN | NaN | NaN | NaN | NaN | NaN | male
1298 | 1299 | 3rd | 0 | Windelov, Mr Einar | NaN | NaN | NaN | NaN | NaN | NaN | male
1299 | 1300 | 3rd | 0 | Wirz, Mr Albert | NaN | NaN | NaN | NaN | NaN | NaN | male
1300 | 1301 | 3rd | 0 | Wiseman, Mr Phillippe | NaN | NaN | NaN | NaN | NaN | NaN | male
1301 | 1302 | 3rd | 0 | Wittevrongel, Mr Camiel | NaN | NaN | NaN | NaN | NaN | NaN | male
1302 | 1303 | 3rd | 1 | Yalsevac, Mr Ivan | NaN | NaN | NaN | NaN | NaN | NaN | male
1303 | 1304 | 3rd | 0 | Yasbeck, Mr Antoni | NaN | NaN | NaN | NaN | NaN | NaN | male
1304 | 1305 | 3rd | 1 | Yasbeck, Mrs Antoni | NaN | NaN | NaN | NaN | NaN | NaN | female
1305 | 1306 | 3rd | 0 | Youssef, Mr Gerios | NaN | NaN | NaN | NaN | NaN | NaN | male
1306 | 1307 | 3rd | 0 | Zabour, Miss Hileni | NaN | NaN | NaN | NaN | NaN | NaN | female
1307 | 1308 | 3rd | 0 | Zabour, Miss Tamini | NaN | NaN | NaN | NaN | NaN | NaN | female
1308 | 1309 | 3rd | 0 | Zakarian, Mr Artun | NaN | NaN | NaN | NaN | NaN | NaN | male
1309 | 1310 | 3rd | 0 | Zakarian, Mr Maprieder | NaN | NaN | NaN | NaN | NaN | NaN | male
1310 | 1311 | 3rd | 0 | Zenn, Mr Philip | NaN | NaN | NaN | NaN | NaN | NaN | male
1311 | 1312 | 3rd | 0 | Zievens, Rene | NaN | NaN | NaN | NaN | NaN | NaN | female
1312 | 1313 | 3rd | 0 | Zimmerman, Leo | NaN | NaN | NaN | NaN | NaN | NaN | male

1313 rows × 11 columns

In [5]:

titan.describe()

Out[5]:

         row.names     survived         age
count  1313.000000  1313.000000  633.000000
mean    657.000000     0.341965   31.194181
std     379.174762     0.474549   14.747525
min       1.000000     0.000000    0.166700
25%     329.000000     0.000000   21.000000
50%     657.000000     0.000000   30.000000
75%     985.000000     1.000000   41.000000
max    1313.000000     1.000000   71.000000

In [6]:

# 2. Basic data processing
# 2.1 Select the features and the target
x = titan[["pclass", "age", "sex"]]
y = titan["survived"]

In [7]:

x.head()

Out[7]:

  pclass      age     sex
0    1st  29.0000  female
1    1st   2.0000  female
2    1st  30.0000    male
3    1st  25.0000  female
4    1st   0.9167    male

In [8]:

y.head()

Out[8]:

0    1
1    0
2    0
3    0
4    1
Name: survived, dtype: int64

In [9]:

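# 2.2 Handle missing values
# Assign the filled column back instead of calling fillna(inplace=True) on a slice of titan
x = x.assign(age=x["age"].fillna(titan["age"].mean()))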

In [10]:

x.head()

Out[10]:

  pclass      age     sex
0    1st  29.0000  female
1    1st   2.0000  female
2    1st  30.0000    male
3    1st  25.0000  female
4    1st   0.9167    male

In [11]:

# 2.3 Split the dataset
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=22, test_size=0.2)

In [12]:

# 3. Feature engineering (dict feature extraction)

In [13]:

x_train.head()

Out[13]:

     pclass        age     sex
649     3rd  45.000000  female
1078    3rd  31.194181    male
59      1st  31.194181  female
201     1st  18.000000    male
61      1st  31.194181  female

In [14]:

x_train = x_train.to_dict(orient="records")
x_test = x_test.to_dict(orient="records")

In [15]:

x_train

Out[15]:

[{'pclass': '3rd', 'age': 45.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 18.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 6.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 27.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 21.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 4.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 13.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 30.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 50.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 22.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 49.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 62.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 32.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 64.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 55.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 24.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 6.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 10.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 53.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 36.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 19.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 28.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 17.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 21.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 25.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 19.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 21.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 48.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 27.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 46.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 29.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 35.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 38.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 32.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 16.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 16.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 33.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 17.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 33.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 52.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 35.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 45.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 50.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 52.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 20.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 32.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 34.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 33.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 21.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 45.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 43.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 59.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 47.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 38.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 51.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 36.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 6.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 58.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 4.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 35.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 12.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 19.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 64.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 27.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 34.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 18.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 48.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 50.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 18.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 34.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 21.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 44.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 19.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 39.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 42.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 69.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 2.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 22.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 47.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 22.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 42.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 21.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 48.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 45.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 45.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 39.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 14.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 30.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 32.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 54.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 36.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 47.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 0.8333, 'sex': 'male'},
 {'pclass': '1st', 'age': 53.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 24.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 37.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 40.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 22.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 29.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 55.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 49.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 24.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 22.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 54.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 38.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 42.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 52.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 19.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 8.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 57.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 22.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 16.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 45.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 28.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 19.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 28.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 24.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 38.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 36.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 55.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 25.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 32.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 9.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 29.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 40.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 39.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 49.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 17.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 24.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 28.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 40.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 6.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 17.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 34.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 41.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 28.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 61.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 17.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 3.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 24.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 30.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 41.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 42.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 28.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 48.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 50.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 16.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 40.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 23.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 34.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 39.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 34.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 22.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 9.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 22.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 30.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 25.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 24.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 26.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 57.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 39.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 35.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 41.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 67.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 11.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 22.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 20.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 50.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 33.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 36.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 48.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 59.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 17.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 45.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 49.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 33.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 46.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 52.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 36.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 28.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 28.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 19.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 43.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 51.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 3.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 48.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 48.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 16.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 44.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 36.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 37.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 32.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 30.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 22.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 40.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 24.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 65.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 37.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 52.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 0.8333, 'sex': 'male'},
 {'pclass': '2nd', 'age': 35.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 27.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 27.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 41.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 18.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 33.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 56.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 40.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 9.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 28.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 30.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 48.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 35.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 1.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 2.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 32.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 29.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 21.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 27.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 38.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 28.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 0.9167, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 39.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 9.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 45.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 14.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 30.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 22.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 60.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 48.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 28.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 32.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 30.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 46.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 32.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 27.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 61.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 39.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 0.1667, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 15.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 24.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 17.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 42.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 20.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 62.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 49.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 23.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 33.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 19.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 45.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 70.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 37.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 28.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 54.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 51.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 21.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 64.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 29.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 33.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 50.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 19.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 59.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 49.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 38.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 48.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 54.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 19.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 3.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 18.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 22.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 34.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 28.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 15.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 40.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 46.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 8.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 63.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 43.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 16.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 38.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 1.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 35.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 42.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 38.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 17.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 40.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 4.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 29.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 22.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 57.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 40.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 47.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 37.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 42.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 21.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 5.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 21.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 40.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 9.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 41.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 28.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 39.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 35.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 24.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 45.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 24.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 50.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 56.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 32.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 19.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 45.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 22.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 50.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 24.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 40.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 21.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 52.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 45.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 11.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 23.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 26.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 40.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 49.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 18.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 9.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 35.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 32.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 21.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 32.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 45.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 26.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 21.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 27.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 24.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 18.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 56.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 16.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 18.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 64.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 46.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 18.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 46.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 29.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 18.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 33.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 34.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 0.8333, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 58.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 60.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 44.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 71.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 13.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 58.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 4.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 16.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 33.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 33.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 48.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 28.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 55.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 54.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 71.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 47.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 21.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 21.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 24.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 23.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 18.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 54.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 17.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 6.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 45.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 36.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 55.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 26.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '1st', 'age': 65.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 27.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 22.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '2nd', 'age': 7.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 30.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 39.0, 'sex': 'female'},
 {'pclass': '1st', 'age': 19.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 19.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 20.0, 'sex': 'male'},
 {'pclass': '1st', 'age': 56.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 38.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 23.0, 'sex': 'female'},
 {'pclass': '2nd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 42.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 23.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 25.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '2nd', 'age': 16.0, 'sex': 'male'},
 {'pclass': '2nd', 'age': 42.0, 'sex': 'male'},
 {'pclass': '3rd', 'age': 2.0, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'female'},
 {'pclass': '3rd', 'age': 31.19418104265403, 'sex': 'male'},
 {'pclass': '1st', 'age': 36.0, 'sex': 'male'},
 ...]

In [16]:

transfer = DictVectorizer()

# Learn the feature mapping from the training set only, then apply it unchanged to the test set
x_train = transfer.fit_transform(x_train)
x_test = transfer.transform(x_test)
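
The sketch below illustrates why a vectorizer is normally fitted on the training records and only applied to the test records: the column layout learned from the training set is reused, so both matrices have the same columns in the same order. The records are hypothetical toy rows shaped like the Titanic dicts above, and `train_rows`, `test_rows`, `vec`, `x_train_demo` and `x_test_demo` are names invented for this illustration.

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical toy records shaped like the Titanic dicts above
train_rows = [
    {"pclass": "1st", "age": 46.0, "sex": "male"},
    {"pclass": "3rd", "age": 18.0, "sex": "female"},
]
test_rows = [
    {"pclass": "2nd", "age": 30.0, "sex": "male"},  # "2nd" never appears in train_rows
]

vec = DictVectorizer(sparse=False)
x_train_demo = vec.fit_transform(train_rows)  # learns the column layout
x_test_demo = vec.transform(test_rows)        # reuses that layout

print(vec.feature_names_)
print(x_test_demo)
```

A category that was never seen during fitting (here `pclass=2nd`) gets no column of its own, so the test row simply has zeros in every `pclass` column. Calling `fit_transform` on the test set instead would relearn the layout from the test records and could silently produce columns that no longer line up with the training matrix.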

In [21]:

# 4. Train the XGBoost model
# 4.1 Baseline model
from xgboost import XGBClassifier

xg = XGBClassifier()

xg.fit(x_train, y_train)

Out[21]:

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0,
              learning_rate=0.1, max_delta_step=0, max_depth=3,
              min_child_weight=1, missing=None, n_estimators=100, n_jobs=1,
              nthread=None, objective='binary:logistic', random_state=0,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
              silent=None, subsample=1, verbosity=1)

In [22]:

xg.score(x_test, y_test)

Out[22]:

0.7832699619771863

In [23]:

# 4.2 Tune max_depth (try depths 0 through 9 and record the test accuracy of each)

depth_range  = range(10)
score = []

for i in depth_range:
    xg = XGBClassifier(eta=1, gamma=0, max_depth=i)
    xg.fit(x_train, y_train)
    
    s = xg.score(x_test, y_test)
    
    print(s)
    score.append(s)

0.6311787072243346
0.7908745247148289
0.7870722433460076
0.7832699619771863
0.7870722433460076
0.7908745247148289
0.7908745247148289
0.7946768060836502
0.7908745247148289
0.7946768060836502

In [25]:

# 4.3 Visualize the tuning results
import matplotlib.pyplot as plt

plt.plot(depth_range, score)

plt.show()
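
Besides reading the best depth off the plot, it can be picked programmatically. The sketch below is a minimal illustration that copies the ten accuracy values printed by cell In [23]; `best_index` and `best_depth` are names introduced here, not part of the original notebook.

```python
# Accuracy values copied from the output of cell In [23] above (max_depth = 0..9)
depth_range = range(10)
score = [
    0.6311787072243346, 0.7908745247148289, 0.7870722433460076,
    0.7832699619771863, 0.7870722433460076, 0.7908745247148289,
    0.7908745247148289, 0.7946768060836502, 0.7908745247148289,
    0.7946768060836502,
]

# max() returns the first maximal element, so ties resolve to the shallowest depth
best_index = max(range(len(score)), key=score.__getitem__)
best_depth = depth_range[best_index]

print("best max_depth:", best_depth, "accuracy:", score[best_index])
```

Depths 7 and 9 tie on accuracy here, and the shallower one is returned. Because these scores were measured on the test set itself, the chosen depth is likely to look slightly better than it would on unseen data; a separate validation split or cross-validation would give a more reliable choice.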


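5.4 Otto case study -- Otto Group Product Classification Challenge (XGBoost implementation)

1 Background

The Otto Group is one of the world's largest e-commerce companies, with subsidiaries in more than 20 countries. It sells millions of products around the world every day, so classifying those products consistently by their characteristics matters a great deal.

In practice, however, staff found that many identical products had been assigned to different categories. The task in this case study is to classify the Otto Group's products correctly and to make that classification as accurate as possible.

Link: https://www.kaggle.com/c/otto-group-product-classification-challenge/overview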

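2 Approach

  • 1. Data acquisition

  • 2. Basic data processing

    • 2.1 Take a subset of the data
    • 2.2 Convert the label values to numbers
    • 2.3 Split the data (using StratifiedShuffleSplit)
    • 2.4 Standardize the data
    • 2.5 Reduce dimensionality with PCA
  • 3. Model training

    • 3.1 Baseline model training
    • 3.2 Model tuning
      • 3.2.1 Parameters to tune:
        • n_estimators,
        • max_depth,
        • min_child_weight,
        • subsample,
        • colsample_bytree,
        • eta
      • 3.2.2 Settle on the final best parameters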

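3 Partial code implementation

  • 2. Basic data processing

    • 2.1 Take a subset of the data

    • 2.2 Convert the label values to numbers

    • 2.3 Split the data (using StratifiedShuffleSplit)

      # Split the dataset with StratifiedShuffleSplit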
      from sklearn.model_selection import StratifiedShuffleSplit
      
      sss = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
      for train_index, test_index in sss.split(X_resampled.values, y_resampled):
          print(len(train_index))
          print(len(test_index))
      
          x_train = X_resampled.values[train_index]
          x_val = X_resampled.values[test_index]
      
          y_train = y_resampled[train_index]
          y_val = y_resampled[test_index]
      
      # Plot the class distribution of the validation split
      import matplotlib.pyplot as plt
      import seaborn as sns
      
      sns.countplot(x=y_val)
      
      plt.show()
      
    • 2.4 Standardize the data

      from sklearn.preprocessing import StandardScaler
      
      scaler = StandardScaler()
      scaler.fit(x_train)
      
      x_train_scaled = scaler.transform(x_train)
      x_val_scaled = scaler.transform(x_val)
      
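    • 2.5 Reduce dimensionality with PCA

      print(x_train_scaled.shape)
      # (13888, 93)
      
      from sklearn.decomposition import PCA
      
      # A float n_components keeps just enough components to explain that fraction of the variance
      pca = PCA(n_components=0.9)
      x_train_pca = pca.fit_transform(x_train_scaled)
      x_val_pca = pca.transform(x_val_scaled)
      
      print(x_train_pca.shape, x_val_pca.shape)
      # (13888, 65) (3473, 65)
      

      The output shows that 65 principal components are enough to retain 90% of the variance in the features.

      # Plot the cumulative explained variance
      import numpy as np
      import matplotlib.pyplot as plt
      
      plt.plot(np.cumsum(pca.explained_variance_ratio_))
      
      plt.xlabel("Number of components")
      plt.ylabel("Cumulative explained variance ratio")
      
      plt.show()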

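  • 3. Model training

    • 3.1 Baseline model training

      from xgboost import XGBClassifier
      
      xgb = XGBClassifier()
      xgb.fit(x_train_pca, y_train)
      
      # Predict class probabilities instead of hard labels; log loss is computed from probabilities
      y_pre_proba = xgb.predict_proba(x_val_pca)
      
      # Evaluate the model with log loss
      from sklearn.metrics import log_loss
      # The original call also passed eps=1e-15, the default at the time; recent scikit-learn releases no longer accept it
      log_loss(y_val, y_pre_proba, normalize=True)
      
      xgb.get_params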
      
  • 3.2 Model tuning

    • 3.2.1 Parameters to tune:

      • n_estimators,

        scores_ne = []
        n_estimators = [100,200,400,450,500,550,600,700]
        
        for nes in n_estimators:
            print("n_estimators:", nes)
            xgb = XGBClassifier(max_depth=3, 
                                learning_rate=0.1, 
                                n_estimators=nes, 
                                objective="multi:softprob", 
                                n_jobs=-1, 
                                nthread=4, 
                                min_child_weight=1, 
                                subsample=1, 
                                colsample_bytree=1,
                                seed=42)
        
            xgb.fit(x_train_pca, y_train)
            y_pre = xgb.predict_proba(x_val_pca)
            score = log_loss(y_val, y_pre)
            scores_ne.append(score)
            print("Validation log loss: {}".format(score))
        
        # Plot how the log loss changes with n_estimators
        plt.plot(n_estimators, scores_ne, "o-")
        
        plt.ylabel("log_loss")
        plt.xlabel("n_estimators")
        print("Best n_estimators: {}".format(n_estimators[np.argmin(scores_ne)]))
        

      • max_depth,

        scores_md = []
        max_depths = [1,3,5,6,7]
        
        for md in max_depths:  # changed
            xgb = XGBClassifier(max_depth=md, # changed
                                learning_rate=0.1, 
                                n_estimators=n_estimators[np.argmin(scores_ne)],   # changed: reuse the best value found above
                                objective="multi:softprob", 
                                n_jobs=-1, 
                                nthread=4, 
                                min_child_weight=1, 
                                subsample=1, 
                                colsample_bytree=1,
                                seed=42)
        
            xgb.fit(x_train_pca, y_train)
            y_pre = xgb.predict_proba(x_val_pca)
            score = log_loss(y_val, y_pre)
            scores_md.append(score)  # changed
            print("Validation log loss: {}".format(score))
        
        # Plot how the log loss changes with max_depth
        plt.plot(max_depths, scores_md, "o-")  # changed
        
        plt.ylabel("log_loss")
        plt.xlabel("max_depths")  # changed
        print("Best max_depth: {}".format(max_depths[np.argmin(scores_md)]))  # changed
        
      • min_child_weight,

        • tune it following the same pattern as above
      • subsample,

      • colsample_bytree,

      • eta

    • 3.2.2 Settle on the final best parameters

      xgb = XGBClassifier(learning_rate =0.1, 
                          n_estimators=550, 
                          max_depth=3, 
                          min_child_weight=3, 
                          subsample=0.7, 
                          colsample_bytree=0.7, 
                          nthread=4, 
                          seed=42, 
                          objective='multi:softprob')
      # Note: the final model is fit on the standardized features, not the PCA-reduced ones
      xgb.fit(x_train_scaled, y_train)
      
      y_pre = xgb.predict_proba(x_val_scaled)
      
      print("Validation log loss: {}".format(log_loss(y_val, y_pre, normalize=True)))
      

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Data acquisition

In [2]:

data = pd.read_csv("./data/otto/train.csv")

In [3]:

data.head()

Out[3]:

   id  feat_1  feat_2  feat_3  feat_4  feat_5  feat_6  feat_7  feat_8  feat_9  ...  feat_85  feat_86  feat_87  feat_88  feat_89  feat_90  feat_91  feat_92  feat_93   target
0   1       1       0       0       0       0       0       0       0       0  ...        1        0        0        0        0        0        0        0        0  Class_1
1   2       0       0       0       0       0       0       0       1       0  ...        0        0        0        0        0        0        0        0        0  Class_1
2   3       0       0       0       0       0       0       0       1       0  ...        0        0        0        0        0        0        0        0        0  Class_1
3   4       1       0       0       1       6       1       5       0       0  ...        0        1        2        0        0        0        0        0        0  Class_1
4   5       0       0       0       0       0       0       0       0       0  ...        1        0        0        0        0        1        0        0        0  Class_1

5 rows × 95 columns

In [4]:

data.shape

Out[4]:

(61878, 95)

In [5]:

data.describe()

Out[5]:

                 id       feat_1        feat_2        feat_3        feat_4        feat_5        feat_6        feat_7        feat_8        feat_9  ...       feat_84       feat_85       feat_86       feat_87       feat_88       feat_89       feat_90       feat_91       feat_92       feat_93
count  61878.000000  61878.00000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  ...  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000  61878.000000
mean   30939.500000      0.38668      0.263066      0.901467      0.779081      0.071043      0.025696      0.193704      0.662433      1.011296  ...      0.070752      0.532306      1.128576      0.393549      0.874915      0.457772      0.812421      0.264941      0.380119      0.126135
std    17862.784315      1.52533      1.252073      2.934818      2.788005      0.438902      0.215333      1.030102      2.255770      3.474822  ...      1.151460      1.900438      2.681554      1.575455      2.115466      1.527385      4.597804      2.045646      0.982385      1.201720
min        1.000000      0.00000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000  ...      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000
25%    15470.250000      0.00000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000  ...      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000
50%    30939.500000      0.00000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000  ...      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000
75%    46408.750000      0.00000      0.000000      0.000000      0.000000      0.000000      0.000000      0.000000      1.000000      0.000000  ...      0.000000      0.000000      1.000000      0.000000      1.000000      0.000000      0.000000      0.000000      0.000000      0.000000
max    61878.000000     61.00000     51.000000     64.000000     70.000000     19.000000     10.000000     38.000000     76.000000     43.000000  ...     76.000000     55.000000     65.000000     67.000000     30.000000     61.000000    130.000000     52.000000     19.000000     87.000000

8 rows × 94 columns

In [6]:

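# Plot the class distribution
import seaborn as sns

sns.countplot(x=data.target)

plt.show()

The plot above shows that the classes are imbalanced, so the data needs further processing later.

Basic data processing

The data has already been anonymized, so no special processing is needed.

Take a subset of the data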

In [7]:

new1_data = data[:10000]
new1_data.shape

Out[7]:

(10000, 95)

In [8]:

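# Plot the class distribution of the slice
import seaborn as sns

sns.countplot(x=new1_data.target)

plt.show()

Simply taking the first 10,000 rows does not work, so random undersampling is used instead to obtain the data.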

In [9]:

# Obtain the data by random undersampling
# First separate the feature values from the label values

y = data["target"]
x = data.drop(["id", "target"], axis=1)

In [10]:

x.head()

Out[10]:

   feat_1  feat_2  feat_3  feat_4  feat_5  feat_6  feat_7  feat_8  feat_9  feat_10  ...  feat_84  feat_85  feat_86  feat_87  feat_88  feat_89  feat_90  feat_91  feat_92  feat_93
0       1       0       0       0       0       0       0       0       0        0  ...        0        1        0        0        0        0        0        0        0        0
1       0       0       0       0       0       0       0       1       0        0  ...        0        0        0        0        0        0        0        0        0        0
2       0       0       0       0       0       0       0       1       0        0  ...        0        0        0        0        0        0        0        0        0        0
3       1       0       0       1       6       1       5       0       0        1  ...       22        0        1        2        0        0        0        0        0        0
4       0       0       0       0       0       0       0       0       0        0  ...        0        1        0        0        0        0        1        0        0        0

5 rows × 93 columns

In [11]:

y.head()

Out[11]:

0    Class_1
1    Class_1
2    Class_1
3    Class_1
4    Class_1
Name: target, dtype: object

In [12]:

# Undersample the data so that every class is equally represented
from imblearn.under_sampling import RandomUnderSampler

rus = RandomUnderSampler(random_state=0)

X_resampled, y_resampled = rus.fit_resample(x, y)

In [13]:

x.shape, y.shape

Out[13]:

((61878, 93), (61878,))

In [14]:

X_resampled.shape, y_resampled.shape

Out[14]:

((17361, 93), (17361,))
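
The resampled size is consistent with balanced classes: 17361 = 9 × 1929, which is what one would expect if each of the nine classes were cut down to the size of the smallest one. The sketch below shows the idea behind random undersampling in plain Python. It is an illustration only, not the imbalanced-learn implementation, and `random_undersample` and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(labels, seed=0):
    """Return sorted row indices so that every class keeps as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    target = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))  # sample without replacement
    return sorted(kept)


# Hypothetical imbalanced labels, not the Otto data
labels = ["Class_1"] * 3 + ["Class_2"] * 10 + ["Class_3"] * 5
kept = random_undersample(labels)

print(Counter(labels[i] for i in kept))
```

The price of this balance is discarded data: in the notebook, roughly 44,500 of the 61,878 rows are dropped before training.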

In [15]:

# Plot the class distribution after undersampling
import seaborn as sns

sns.countplot(x=y_resampled)

plt.show()

Convert the label values to numbers

In [16]:

y_resampled.head()

Out[16]:

0    Class_1
1    Class_1
2    Class_1
3    Class_1
4    Class_1
Name: target, dtype: object

In [17]:

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
y_resampled = le.fit_transform(y_resampled)
 

In [18]:

y_resampled

Out[18]:

array([0, 0, 0, ..., 8, 8, 8])
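
The mapping that `LabelEncoder` learns is simple: the distinct labels are sorted and numbered from 0. The plain-Python sketch below mimics that mapping on a hypothetical label list; `classes`, `to_code`, `encoded` and `decoded` are illustrative names, and the comments point to the scikit-learn attribute or method each one stands in for.

```python
# Hypothetical label column; the real one holds Class_1 ... Class_9
labels = ["Class_3", "Class_1", "Class_9", "Class_1", "Class_2"]

classes = sorted(set(labels))                       # plays the role of le.classes_
to_code = {name: code for code, name in enumerate(classes)}

encoded = [to_code[name] for name in labels]        # like le.transform(labels)
decoded = [classes[code] for code in encoded]       # like le.inverse_transform(encoded)

print(classes)
print(encoded)
```

Codes depend on which labels are present when the encoder is fitted: in this toy list only four classes occur, so `Class_9` receives code 3, whereas in the notebook all nine classes are present and it receives code 8. Keeping the fitted encoder around makes it possible to turn predicted codes back into class names.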

Split the data

In [19]:

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(X_resampled, y_resampled, test_size=0.2)

In [20]:

x_train.shape, y_train.shape

Out[20]:

((13888, 93), (13888,))

In [21]:

x_test.shape, y_test.shape

Out[21]:

((3473, 93), (3473,))

In [22]:

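# 1. Data acquisition

# 2. Basic data processing

    # 2.1 Take a subset of the data
    # 2.2 Convert the label values to numbers
    # 2.3 Split the data (using StratifiedShuffleSplit)
    # 2.4 Standardize the data
    # 2.5 Reduce dimensionality with PCA

# 3. Model training
    # 3.1 Baseline model training
    # 3.2 Model tuning
        # 3.2.1 Parameters to tune:
            # n_estimators,
            # max_depth,
            # min_child_weight,
            # subsample,
            # colsample_bytree,
            # eta
        # 3.2.2 Settle on the final best parameters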
    

In [23]:

# Plot the class distribution of the test split produced by train_test_split
import seaborn as sns

sns.countplot(x=y_test)
plt.show()

In [28]:

# Split the data with StratifiedShuffleSplit so each split keeps the class proportions

from sklearn.model_selection import StratifiedShuffleSplit

sss = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)

for train_index, test_index in sss.split(X_resampled.values, y_resampled):
    print(len(train_index))
    print(len(test_index))
    
    x_train = X_resampled.values[train_index]
    x_val = X_resampled.values[test_index]
    
    y_train = y_resampled[train_index]
    y_val = y_resampled[test_index]

13888
3473

In [29]:

print(x_train.shape, x_val.shape)

(13888, 93) (3473, 93)
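
What stratification buys can be checked on a small example. The sketch below uses hypothetical labels (80 samples of one class, 20 of another) and counts the classes on each side of the split; `X_demo`, `y_demo`, `train_counts` and `val_counts` are names made up for the illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical imbalanced labels: 80 samples of class 0 and 20 of class 1
y_demo = np.array([0] * 80 + [1] * 20)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)

sss = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
# With n_splits=1 the generator yields a single pair, so next() can replace the for loop
train_index, val_index = next(sss.split(X_demo, y_demo))

# Class counts on each side of the split
train_counts = np.bincount(y_demo[train_index])
val_counts = np.bincount(y_demo[val_index])
print("train:", train_counts, "validation:", val_counts)
```

Both sides keep the original 4:1 ratio. A plain `train_test_split` without `stratify` gives no such guarantee, which is the reason the notebook switches to `StratifiedShuffleSplit` here.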

In [30]:

# Plot the class distribution of the validation split
import seaborn as sns

sns.countplot(x=y_val)
plt.show()

Standardize the data

In [31]:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(x_train)

x_train_scaled = scaler.transform(x_train)
x_val_scaled = scaler.transform(x_val)
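
The cell above fits the scaler on the training split and reuses it for the validation split. The arithmetic behind that can be sketched with the standard library alone; the single feature column below is hypothetical, and the sketch assumes the usual definition of standardization (subtract the training mean, divide by the population standard deviation of the training data).

```python
from statistics import fmean, pstdev

# Hypothetical single feature column
train_col = [1.0, 2.0, 3.0, 4.0]
val_col = [2.5, 10.0]

# Statistics come from the training data only
mean = fmean(train_col)
std = pstdev(train_col)  # population standard deviation (divide by n)

train_scaled = [(v - mean) / std for v in train_col]
val_scaled = [(v - mean) / std for v in val_col]  # reuses the training mean and std

print(train_scaled)
print(val_scaled)
```

The scaled training column has mean 0 and standard deviation 1, while the validation column generally does not, and that is expected: the validation data is measured on the training data's scale so that the model sees both in the same units.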

Reduce dimensionality with PCA

In [33]:

x_train_scaled.shape

Out[33]:

(13888, 93)

In [34]:

from sklearn.decomposition import PCA

pca = PCA(n_components=0.9)

x_train_pca = pca.fit_transform(x_train_scaled)
x_val_pca = pca.transform(x_val_scaled)

In [35]:

print(x_train_pca.shape, x_val_pca.shape)

(13888, 65) (3473, 65)
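
Passing a float such as `n_components=0.9` asks PCA to keep just enough leading components to explain that share of the variance, which is how 93 features became 65 above. The sketch below shows the same behaviour on hypothetical random data (200 samples, 10 features with decreasing spread); `X_demo`, `X_reduced` and `cumulative` are illustrative names.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 200 samples, 10 features with decreasing spread
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10)) * np.arange(10, 0, -1)

pca = PCA(n_components=0.9)
X_reduced = pca.fit_transform(X_demo)

# Running total of the variance explained by the kept components
cumulative = np.cumsum(pca.explained_variance_ratio_)
print("components kept:", pca.n_components_)
print("variance explained by the kept components:", cumulative[-1])
```

The last value of `cumulative` is at least 0.9, and dropping the final kept component would fall below that threshold. The same curve is what the next cell plots for the Otto features.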

In [37]:

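# Plot the cumulative explained variance against the number of components
plt.plot(np.cumsum(pca.explained_variance_ratio_))

plt.xlabel("Number of components")
plt.ylabel("Cumulative explained variance ratio")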

plt.show()

Model training

Baseline model training

In [38]:

from xgboost import XGBClassifier

xgb = XGBClassifier()
xgb.fit(x_train_pca, y_train)

Out[38]:

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0,
              learning_rate=0.1, max_delta_step=0, max_depth=3,
              min_child_weight=1, missing=None, n_estimators=100, n_jobs=1,
              nthread=None, objective='multi:softprob', random_state=0,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
              silent=None, subsample=1, verbosity=1)

In [39]:

# Predict class probabilities; log loss must be computed from probabilities, not hard labels
y_pre_proba = xgb.predict_proba(x_val_pca)

In [40]:

y_pre_proba

Out[40]:

array([[0.4893983 , 0.00375719, 0.00225278, ..., 0.06179977, 0.17131925,
        0.03980364],
       [0.14336601, 0.01110009, 0.01018962, ..., 0.00691424, 0.02062171,
        0.7525783 ],
       [0.00834821, 0.14602502, 0.65013766, ..., 0.01385602, 0.00602207,
        0.00240582],
       ...,
       [0.09568001, 0.00293341, 0.00582061, ..., 0.1031019 , 0.7587154 ,
        0.02730099],
       [0.40236628, 0.12317444, 0.03567632, ..., 0.18818544, 0.13276173,
        0.07105519],
       [0.00473167, 0.01536749, 0.02546864, ..., 0.00882399, 0.88531935,
        0.00384397]], dtype=float32)

In [42]:

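# Evaluate with log loss
from sklearn.metrics import log_loss

# The original call also passed eps=1e-15, the default at the time; recent scikit-learn releases no longer accept it
log_loss(y_val, y_pre_proba, normalize=True)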

Out[42]:

0.7845457684689274
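
Multi-class log loss is the mean negative logarithm of the probability the model assigned to each sample's true class. The sketch below computes it by hand on hypothetical predictions so the number above is easier to interpret; `multiclass_log_loss` is an illustrative helper, not the scikit-learn function, and its clipping constant is an assumption chosen to avoid log(0).

```python
import math


def multiclass_log_loss(y_true, proba, eps=1e-15):
    """Mean negative log of the probability assigned to each sample's true class."""
    total = 0.0
    for label, row in zip(y_true, proba):
        p = min(max(row[label], eps), 1 - eps)  # clip to avoid log(0)
        total += -math.log(p)
    return total / len(y_true)


# Hypothetical predictions for 3 samples over 3 classes (each row sums to 1)
y_true = [0, 2, 1]
proba = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.3, 0.4, 0.3],
]

print(multiclass_log_loss(y_true, proba))
```

As a reference point, a model that assigns probability 1/9 to each of nine classes scores ln 9 ≈ 2.197, and a perfect model approaches 0, so the baseline value of about 0.78 is well ahead of uniform guessing but leaves room for tuning.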

In [43]:

xgb.get_params

Out[43]:

<bound method XGBModel.get_params of XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0,
              learning_rate=0.1, max_delta_step=0, max_depth=3,
              min_child_weight=1, missing=None, n_estimators=100, n_jobs=1,
              nthread=None, objective='multi:softprob', random_state=0,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
              silent=None, subsample=1, verbosity=1)>

Model tuning

Find the best n_estimators

In [44]:

scores_ne = []
n_estimators = [100, 200, 300, 400, 500, 550, 600, 700]

In [49]:

for nes in n_estimators:
    print("n_estimators:", nes)
    xgb = XGBClassifier(max_depth=3,
                        learning_rate=0.1, 
                        n_estimators=nes, 
                        objective="multi:softprob", 
                        n_jobs=-1, 
                        nthread=4, 
                        min_child_weight=1,
                        subsample=1,
                        colsample_bytree=1,
                        seed=42)
    
    xgb.fit(x_train_pca, y_train)
    y_pre = xgb.predict_proba(x_val_pca)
    score = log_loss(y_val, y_pre)
    scores_ne.append(score)
    
    print("Validation log loss: {}".format(score))

n_estimators: 100
Validation log loss: 0.7845457684689274
n_estimators: 200
Validation log loss: 0.7163659085830947
n_estimators: 300
Validation log loss: 0.6933389946023942
n_estimators: 400
Validation log loss: 0.68119252278615
n_estimators: 500
Validation log loss: 0.67700775120196
n_estimators: 550
Validation log loss: 0.6756911007299885
n_estimators: 600
Validation log loss: 0.6757532660164814
n_estimators: 700
Validation log loss: 0.6778721089881976

In [50]:

# Plot the log loss for each n_estimators value
plt.plot(n_estimators, scores_ne, "o-")

plt.xlabel("n_estimators")
plt.ylabel("log_loss")
plt.show()

print("Best n_estimators: {}".format(n_estimators[np.argmin(scores_ne)]))

Best n_estimators: 550

Find the best max_depth

In [63]:

scores_md = []
max_depths = [1,3,5,6,7]

In [64]:

for md in max_depths:
    print("max_depth:", md)
    xgb = XGBClassifier(max_depth=md,
                        learning_rate=0.1, 
                        n_estimators=n_estimators[np.argmin(scores_ne)], 
                        objective="multi:softprob", 
                        n_jobs=-1, 
                        nthread=4, 
                        min_child_weight=1,
                        subsample=1,
                        colsample_bytree=1,
                        seed=42)
    
    xgb.fit(x_train_pca, y_train)
    y_pre = xgb.predict_proba(x_val_pca)
    score = log_loss(y_val, y_pre)
    scores_md.append(score)
    
    print("Validation log loss: {}".format(score))

max_depth: 1
Validation log loss: 0.8186777106711784
max_depth: 3
Validation log loss: 0.6756911007299885
max_depth: 5
Validation log loss: 0.730323661087053
max_depth: 6
Validation log loss: 0.7693314501840949
max_depth: 7
Validation log loss: 0.7889236364892144

In [67]:

# Plot the log loss for each max_depth value
plt.plot(max_depths, scores_md, "o-")

plt.xlabel("max_depths")
plt.ylabel("log_loss")
plt.show()

print("Best max_depth: {}".format(max_depths[np.argmin(scores_md)]))

Best max_depth: 3

Following the same pattern, tune the parameters below:

min_child_weight,

subsample,

colsample_bytree,

eta
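
The loops for n_estimators and max_depth share one shape: score every candidate value, then keep the one with the lowest log loss. That shape can be factored into a small helper, sketched below. `tune_parameter` and `fake_log_loss` are hypothetical names; `fake_log_loss` is only a stand-in for "fit an XGBClassifier with this value and return the validation log loss", so the sketch runs without training anything.

```python
def tune_parameter(candidates, evaluate):
    """Score every candidate value and return (best_value, scores); lower score is better."""
    scores = [evaluate(value) for value in candidates]
    best_index = min(range(len(scores)), key=scores.__getitem__)
    return candidates[best_index], scores


# Hypothetical stand-in for "fit XGBClassifier with this value and return the validation log loss"
def fake_log_loss(min_child_weight):
    return 0.60 + 0.01 * (min_child_weight - 3) ** 2


best, scores = tune_parameter([1, 2, 3, 4, 5], fake_log_loss)
print("best value:", best)
```

In the notebook, `evaluate` would build the classifier with the values already chosen for the earlier parameters, fit it on `x_train_pca`, and return `log_loss(y_val, ...)` on `x_val_pca`. Tuning one parameter at a time like this is cheap but greedy: it can miss combinations that a joint grid or randomized search would find.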

In [69]:

xgb = XGBClassifier(learning_rate =0.1, 
                    n_estimators=550, 
                    max_depth=3, 
                    min_child_weight=3, 
                    subsample=0.7, 
                    colsample_bytree=0.7, 
                    nthread=4, 
                    seed=42, 
                    objective='multi:softprob')

# Note: the final model is fit on the standardized features, not the PCA-reduced ones
xgb.fit(x_train_scaled, y_train)

y_pre = xgb.predict_proba(x_val_scaled)

print("Validation log loss: {}".format(log_loss(y_val, y_pre, normalize=True)))

Validation log loss: 0.5944022517380477
