1. Overview of the K-Nearest Neighbors Algorithm
The k-nearest neighbors (KNN) algorithm classifies samples by measuring the distance
between their feature values.
K-Nearest Neighbors
Advantages: high accuracy, insensitive to outliers.
Disadvantages: high computational complexity and high space complexity.
Applicable data types: numeric and nominal values.
2. How It Works
- There exists a sample data set, also called the training sample set. Every sample in it carries a label, i.e., we know which class each sample in the set belongs to. When a new, unlabeled sample is supplied, we compare each of its features with the corresponding features of the samples in the training set, and then take the class label of the most similar (nearest) samples in the set.
3. Pseudocode for the K-Nearest Neighbors Algorithm
For each point whose class is unknown, perform the following operations in order:
1. Compute the distance from the current point to every point in the data set of known classes.
2. Sort the distances in increasing order.
3. Take the K points closest to the current point.
4. Count how often each class appears among these K points.
5. Return the most frequent class among the K points as the predicted class of the current point.
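The five steps above can be sketched in a few lines of plain Python (a minimal illustration of my own, separate from the NumPy implementation given later; the name `knn_sketch` is made up):

```python
from collections import Counter

def knn_sketch(inX, dataSet, labels, k):
    # step 1: distance from every known point to inX
    dists = [sum((a - b) ** 2 for a, b in zip(row, inX)) ** 0.5 for row in dataSet]
    # steps 2-3: indices of the k nearest points
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    # steps 4-5: most frequent label among those k points
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

print(knn_sketch([0, 0.2], [[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]],
                 ['A', 'A', 'B', 'B'], 3))  # → B
```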
4. Examples
4.1 Example 1: A simple KNN implementation (labeling a coordinate)
1. Preparation: importing data with Python
from numpy import *
import operator

def createDataSet():
    group = array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
    labels = ['A','A','B','B']
    return group, labels
Call createDataSet() to package the data:
group,labels = createDataSet()
Program output:
group
array([[1. , 1.1],
[1. , 1. ],
[0. , 0. ],
[0. , 0.1]])
labels
['A', 'A', 'B', 'B']
2. Parsing data from a text file
The k-nearest neighbors algorithm:
# Four input parameters:
#   inX: the input vector to classify
#   dataSet: the training sample set
#   labels: the label vector
#   k: the number of nearest neighbors to use in the vote
# Requirement: the label vector must have as many elements as dataSet has rows
def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    diffMat = tile(inX, (dataSetSize,1)) - dataSet # tile (from numpy) stacks inX dataSetSize times so it can be subtracted element-wise
    sqDiffMat = diffMat**2                         # square the differences
    sqDistances = sqDiffMat.sum(axis=1)            # sum along each row, one value per training sample
    distances = sqDistances**0.5                   # take the square root
    sortedDistIndicies = distances.argsort()       # indices that would sort the distances from smallest to largest
    classCount={}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel,0) + 1
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]
- The distance between any two vectors is the Euclidean distance (the square root of the sum of squared differences). After the distances to all points are computed, the data are sorted in increasing order; then the main classes of the k smallest-distance elements are determined (k is always a positive integer). Finally, the classCount dictionary is decomposed into a list of tuples, which is sorted by its second element using the itemgetter method of the operator module imported at the top. The sort is reversed, i.e., from largest to smallest, and the most frequent class label is returned.
result = classify0([1,1.3],group,labels,3)
result
Output: the coordinate belongs to class A.
'A'
4.2 Example 2: Movie classification (romance vs. action)
- Sample data set: the training set, containing N samples.
- One data set of unknown type (the target data to classify). We compute its distance to each of the N training samples, sort those distances, take the K nearest, and pick the class (movie genre) that occurs most often among them; by this rule we decide which class our data belongs to.
1. How is the distance computed?
- Euclidean distance.
- The distance between two vector points a1 and a2 is just the ordinary distance between two points.
- For example, with 4 input features, the distance between (1,2,4,6) and (2,3,5,4) is
- sqrt((1-2)^2 + (2-3)^2 + (4-5)^2 + (6-4)^2)
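This arithmetic can be checked directly with NumPy (a quick verification of my own, not part of the original example):

```python
import numpy as np

a = np.array([1, 2, 4, 6])
b = np.array([2, 3, 5, 4])
# Euclidean distance: square root of the sum of squared differences
d = np.sqrt(((a - b) ** 2).sum())  # sqrt(1 + 1 + 1 + 4) = sqrt(7)
print(d)  # ≈ 2.6458
```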
2. Implementing the algorithm
- Here the data set is built with a Python dict and then converted to a DataFrame.
import pandas as pd
rowdata={'电影名称':['无问西东','后来的我们','前任3','红海行动','唐人街探案','战狼2'],
'打斗镜头':[1,5,12,108,112,115],
'接吻镜头':[101,89,97,5,9,8],
'电影类型':['爱情片','爱情片','爱情片','动作片','动作片','动作片']}
movie_data= pd.DataFrame(rowdata)
movie_data
电影名称 | 打斗镜头 | 接吻镜头 | 电影类型 | |
---|---|---|---|---|
0 | 无问西东 | 1 | 101 | 爱情片 |
1 | 后来的我们 | 5 | 89 | 爱情片 |
2 | 前任3 | 12 | 97 | 爱情片 |
3 | 红海行动 | 108 | 5 | 动作片 |
4 | 唐人街探案 | 112 | 9 | 动作片 |
5 | 战狼2 | 115 | 8 | 动作片 |
# Compute the distance from the current point to every point in the known-class data set
new_data = [24,67] # coordinates of the new movie
dist = list((((movie_data.iloc[:6,1:3]-new_data)**2).sum(1))**0.5)
movie_data.iloc[:6,1:3] # the two feature columns
打斗镜头 | 接吻镜头 | |
---|---|---|
0 | 1 | 101 |
1 | 5 | 89 |
2 | 12 | 97 |
3 | 108 | 5 |
4 | 112 | 9 |
5 | 115 | 8 |
(movie_data.iloc[:6,1:3]-new_data) # element-wise difference from the new movie
打斗镜头 | 接吻镜头 | |
---|---|---|
0 | -23 | 34 |
1 | -19 | 22 |
2 | -12 | 30 |
3 | 84 | -62 |
4 | 88 | -58 |
5 | 91 | -59 |
(((movie_data.iloc[:6,1:3]-new_data)**2).sum(1)) # sum the squared differences along each row
0 1685
1 845
2 1044
3 10900
4 11108
5 11762
dtype: int64
list((((movie_data.iloc[:6,1:3]-new_data)**2).sum(1))**0.5) # take the square root, then convert to a list
[41.048751503547585,
29.068883707497267,
32.31098884280702,
104.4030650891055,
105.39449701004318,
108.45275469069469]
# Sort the distances in ascending order, then take the K nearest points
k = 4
dist_l = pd.DataFrame({'dist': dist, 'labels': (movie_data.iloc[:6, 3])})
dr = dist_l.sort_values(by = 'dist')[: k]
pd.DataFrame({'dist': dist, 'labels': (movie_data.iloc[:6, 3])}) # build a dict of the data and convert it to a DataFrame
dist | labels | |
---|---|---|
0 | 41.048752 | 爱情片 |
1 | 29.068884 | 爱情片 |
2 | 32.310989 | 爱情片 |
3 | 104.403065 | 动作片 |
4 | 105.394497 | 动作片 |
5 | 108.452755 | 动作片 |
dist_l.sort_values(by = 'dist')[: k] # sort dist_l by the dist column (ascending by default) and slice off the first four rows
dist | labels | |
---|---|---|
1 | 29.068884 | 爱情片 |
2 | 32.310989 | 爱情片 |
0 | 41.048752 | 爱情片 |
3 | 104.403065 | 动作片 |
# Count how often each class appears among the k nearest points
re = dr.loc[:,'labels'].value_counts() # a DataFrame cannot be sliced by column name directly; loc indexes by label, i.e., the names we defined
re.index[0]
Output:
'爱情片'
result = []
result.append(re.index[0])
result
['爱情片']
* Wrapping it into a function
import pandas as pd
"""
函数功能:KNN分类器
参数说明:
inX:需要预测分类的数据集
dataSet:已知分类标签的数据集(训练集)
k:k-近邻算法参数,选择距离最小的k个点
返回:
result:分类结果
"""
def classify0(inX,dataSet,k):
result=[]
dist = list((((movie_data.iloc[:6,1:3]-new_data)**2).sum(1))**0.5)
dist_l = pd.DataFrame({'dist': dist, 'labels': (movie_data.iloc[:6, 3])})
dr = dist_l.sort_values(by = 'dist')[: k]
re = dr.loc[:,'labels'].value_counts()
result.append(re.index[0])
return result
* Load the data
inX = new_data
dataSet = movie_data
k= 4
* Run the algorithm
classify0(inX,dataSet,k)
Output:
['爱情片']
4.3 Example 3: Judging match quality on a dating site
# Import the data
datingTest = pd.read_table('datingTestSet.txt',header=None)
datingTest.head()
0 | 1 | 2 | 3 | |
---|---|---|---|---|
0 | 40920 | 8.326976 | 0.953952 | largeDoses |
1 | 14488 | 7.153469 | 1.673904 | smallDoses |
2 | 26052 | 1.441871 | 0.805124 | didntLike |
3 | 75136 | 13.147394 | 0.428964 | didntLike |
4 | 38344 | 1.669788 | 0.134296 | didntLike |
datingTest.shape # shape of the data
(1000, 4)
datingTest.info() # summary of the data
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 4 columns):
0 1000 non-null int64
1 1000 non-null float64
2 1000 non-null float64
3 1000 non-null object
dtypes: float64(2), int64(1), object(1)
memory usage: 31.4+ KB
Analyzing the data
# %matplotlib inline makes figures drawn with matplotlib.pyplot (plot() calls or new figure canvases) render directly in the console/notebook.
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
datingTest.shape[0] ## number of rows
1000
# iloc selects rows by integer position (e.g., the second row)
# loc selects rows by the values of the index labels
datingTest.iloc[1,-1] # the last value in the second row
'smallDoses'
datingTest.iloc[:,1] # the second column for all rows
0 8.326976
1 7.153469
2 1.441871
3 13.147394
4 1.669788
...
995 3.410627
996 9.974715
997 10.650102
998 9.134528
999 7.882601
Name: 1, Length: 1000, dtype: float64
# Color-code the different labels
Colors = []
for i in range(datingTest.shape[0]):
    m = datingTest.iloc[i,-1]
    if m=='didntLike':
        Colors.append('black')
    if m=='smallDoses':
        Colors.append('orange')
    if m=='largeDoses':
        Colors.append('red')

# Scatter plots of each pair of features
plt.rcParams['font.sans-serif']=['Simhei'] # use the SimHei font so the Chinese axis labels render
pl=plt.figure(figsize=(12,8))
fig1=pl.add_subplot(221)
plt.scatter(datingTest.iloc[:,1],datingTest.iloc[:,2],marker='.',c=Colors)
plt.xlabel('玩游戏视频所占时间比')
plt.ylabel('每周消费冰淇淋公升数')
fig2=pl.add_subplot(222)
plt.scatter(datingTest.iloc[:,0],datingTest.iloc[:,1],marker='.',c=Colors)
plt.xlabel('每年飞行常客里程')
plt.ylabel('玩游戏视频所占时间比')
fig3=pl.add_subplot(223)
plt.scatter(datingTest.iloc[:,0],datingTest.iloc[:,2],marker='.',c=Colors)
plt.xlabel('每年飞行常客里程')
plt.ylabel('每周消费冰淇淋公升数')
plt.show()
# Min-max normalization
def minmax(dataSet):
    minDf = dataSet.min()
    maxDf = dataSet.max()
    normSet = (dataSet - minDf)/(maxDf - minDf)
    return normSet

# Merge the normalized features with the labels
datingT = pd.concat([minmax(datingTest.iloc[:, :3]), datingTest.iloc[:,3]], axis=1)
datingT.head()
0 | 1 | 2 | 3 | |
---|---|---|---|---|
0 | 0.448325 | 0.398051 | 0.562334 | largeDoses |
1 | 0.158733 | 0.341955 | 0.987244 | smallDoses |
2 | 0.285429 | 0.068925 | 0.474496 | didntLike |
3 | 0.823201 | 0.628480 | 0.252489 | didntLike |
4 | 0.420102 | 0.079820 | 0.078578 | didntLike |
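The effect of `minmax` can be checked on a tiny frame (toy values of my own; the function is restated so the snippet is self-contained). Each column is scaled independently so that its minimum maps to 0 and its maximum to 1:

```python
import pandas as pd

# same min-max scaling as the minmax function above
def minmax(dataSet):
    return (dataSet - dataSet.min()) / (dataSet.max() - dataSet.min())

df = pd.DataFrame({'miles': [40920, 14488, 26052], 'games': [8.3, 7.1, 1.4]})
norm = minmax(df)
print(norm)
# every column of norm now spans exactly [0.0, 1.0]
```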
"""
函数功能:切分训练集与数据集
参数说明:
dataSet:原始数据集
rate:训练集所占的比例
返回:切分好的训练集和数据集
"""
def randSplit(dataSet,rate=0.9):
n = dataSet.shape[0]
m = int(n*rate)
train = dataSet.iloc[:m,:]
test = dataSet.iloc[m:,:] # 数据集无规律,就没有随机选择
test.index = range(test.shape[0]) # 重新分配训练集的索引(从0开始)
return train,test
train,test = randSplit(datingT)
train
0 | 1 | 2 | 3 | |
---|---|---|---|---|
0 | 0.448325 | 0.398051 | 0.562334 | largeDoses |
1 | 0.158733 | 0.341955 | 0.987244 | smallDoses |
2 | 0.285429 | 0.068925 | 0.474496 | didntLike |
3 | 0.823201 | 0.628480 | 0.252489 | didntLike |
4 | 0.420102 | 0.079820 | 0.078578 | didntLike |
5 | 0.799722 | 0.484802 | 0.608961 | didntLike |
6 | 0.393851 | 0.326530 | 0.715335 | largeDoses |
7 | 0.467455 | 0.634645 | 0.320312 | largeDoses |
8 | 0.739507 | 0.412612 | 0.441536 | didntLike |
9 | 0.388757 | 0.586690 | 0.889360 | largeDoses |
10 | 0.550459 | 0.177993 | 0.490309 | didntLike |
11 | 0.693250 | 0.400867 | 0.984636 | didntLike |
12 | 0.061015 | 0.233059 | 0.429367 | smallDoses |
13 | 0.559333 | 0.223721 | 0.368321 | didntLike |
14 | 0.847699 | 0.731360 | 0.194879 | didntLike |
15 | 0.478488 | 0.090321 | 0.112212 | didntLike |
16 | 0.672313 | 0.359321 | 0.748369 | didntLike |
17 | 0.763347 | 0.680671 | 0.153555 | didntLike |
18 | 0.171672 | 0.000000 | 0.737168 | smallDoses |
19 | 0.312119 | 0.503293 | 0.769428 | largeDoses |
20 | 0.071072 | 0.169234 | 0.484741 | smallDoses |
21 | 0.413134 | 0.143004 | 0.491491 | didntLike |
22 | 0.247828 | 0.253252 | 0.376041 | smallDoses |
23 | 0.315340 | 0.315201 | 0.109748 | largeDoses |
24 | 0.216263 | 0.134649 | 0.994506 | smallDoses |
25 | 0.403055 | 0.595538 | 0.382717 | largeDoses |
26 | 0.062899 | 0.000000 | 0.976924 | smallDoses |
27 | 0.312984 | 0.476528 | 0.430886 | largeDoses |
28 | 0.074589 | 0.065243 | 0.377102 | smallDoses |
29 | 0.455896 | 0.011016 | 0.679218 | didntLike |
... | ... | ... | ... | ... |
870 | 0.560253 | 0.087403 | 0.597598 | didntLike |
871 | 0.386335 | 0.483667 | 0.682090 | largeDoses |
872 | 0.468660 | 0.540643 | 0.050246 | largeDoses |
873 | 0.703286 | 0.398771 | 0.818841 | didntLike |
874 | 0.169119 | 0.011555 | 0.421646 | smallDoses |
875 | 0.157790 | 0.501097 | 0.999678 | smallDoses |
876 | 0.069473 | 0.444063 | 0.842632 | smallDoses |
877 | 0.154624 | 0.204089 | 0.078510 | smallDoses |
878 | 0.070010 | 0.000000 | 0.111133 | smallDoses |
879 | 0.096348 | 0.039060 | 0.084110 | smallDoses |
880 | 0.475847 | 0.072105 | 0.384508 | didntLike |
881 | 0.419993 | 0.447429 | 0.030162 | largeDoses |
882 | 0.373254 | 0.480528 | 0.324174 | largeDoses |
883 | 0.337657 | 0.531167 | 0.583112 | largeDoses |
884 | 0.243654 | 0.538543 | 0.426649 | largeDoses |
885 | 0.314715 | 0.496374 | 0.149720 | largeDoses |
886 | 0.625278 | 0.185406 | 0.812594 | didntLike |
887 | 0.793444 | 0.653904 | 0.014277 | didntLike |
888 | 0.309993 | 0.503211 | 0.460594 | largeDoses |
889 | 0.108422 | 0.000000 | 0.544773 | smallDoses |
890 | 0.721144 | 0.196312 | 0.640072 | didntLike |
891 | 0.083760 | 0.388103 | 0.867306 | smallDoses |
892 | 0.781052 | 0.372711 | 0.030206 | didntLike |
893 | 0.056183 | 0.133354 | 0.644440 | smallDoses |
894 | 0.150220 | 0.297665 | 0.168851 | smallDoses |
895 | 0.243665 | 0.486131 | 0.979099 | largeDoses |
896 | 0.165350 | 0.000000 | 0.808206 | smallDoses |
897 | 0.054967 | 0.359158 | 0.080380 | smallDoses |
898 | 0.111106 | 0.393932 | 0.058181 | smallDoses |
899 | 0.389710 | 0.698530 | 0.735519 | largeDoses |
900 rows × 4 columns
test
0 | 1 | 2 | 3 | |
---|---|---|---|---|
0 | 0.513766 | 0.170320 | 0.262181 | didntLike |
1 | 0.089599 | 0.154426 | 0.785277 | smallDoses |
2 | 0.611167 | 0.172689 | 0.915245 | didntLike |
3 | 0.012578 | 0.000000 | 0.195477 | smallDoses |
4 | 0.110241 | 0.187926 | 0.287082 | smallDoses |
5 | 0.812113 | 0.705201 | 0.681085 | didntLike |
6 | 0.729712 | 0.490545 | 0.960202 | didntLike |
7 | 0.130301 | 0.133239 | 0.926158 | smallDoses |
8 | 0.557755 | 0.722409 | 0.780811 | largeDoses |
9 | 0.437051 | 0.247835 | 0.131156 | largeDoses |
10 | 0.722174 | 0.184918 | 0.074908 | didntLike |
11 | 0.719578 | 0.167690 | 0.016377 | didntLike |
12 | 0.690193 | 0.526749 | 0.251657 | didntLike |
13 | 0.403745 | 0.182242 | 0.386039 | didntLike |
14 | 0.401751 | 0.528543 | 0.222839 | largeDoses |
15 | 0.425931 | 0.421948 | 0.590885 | largeDoses |
16 | 0.294479 | 0.534140 | 0.871767 | largeDoses |
17 | 0.506678 | 0.550039 | 0.248375 | largeDoses |
18 | 0.139811 | 0.372772 | 0.086617 | largeDoses |
19 | 0.386555 | 0.485440 | 0.807905 | largeDoses |
20 | 0.748370 | 0.508872 | 0.408589 | didntLike |
21 | 0.342511 | 0.461926 | 0.897321 | largeDoses |
22 | 0.380770 | 0.515810 | 0.774052 | largeDoses |
23 | 0.146900 | 0.134351 | 0.129138 | smallDoses |
24 | 0.332683 | 0.469709 | 0.818801 | largeDoses |
25 | 0.117329 | 0.067943 | 0.399234 | smallDoses |
26 | 0.266585 | 0.531719 | 0.476847 | largeDoses |
27 | 0.498691 | 0.640661 | 0.389745 | largeDoses |
28 | 0.067687 | 0.057949 | 0.493195 | smallDoses |
29 | 0.116562 | 0.074976 | 0.765075 | smallDoses |
... | ... | ... | ... | ... |
70 | 0.588465 | 0.580790 | 0.819148 | largeDoses |
71 | 0.705258 | 0.437379 | 0.515681 | didntLike |
72 | 0.101772 | 0.462088 | 0.808077 | smallDoses |
73 | 0.664085 | 0.173051 | 0.169156 | didntLike |
74 | 0.200914 | 0.250428 | 0.739211 | smallDoses |
75 | 0.250293 | 0.703453 | 0.886825 | largeDoses |
76 | 0.818161 | 0.690544 | 0.714136 | didntLike |
77 | 0.374076 | 0.650571 | 0.214290 | largeDoses |
78 | 0.155062 | 0.150176 | 0.249725 | smallDoses |
79 | 0.102188 | 0.000000 | 0.070700 | smallDoses |
80 | 0.208068 | 0.021738 | 0.609152 | smallDoses |
81 | 0.100720 | 0.024394 | 0.008994 | smallDoses |
82 | 0.025035 | 0.184718 | 0.363083 | smallDoses |
83 | 0.104007 | 0.321426 | 0.331622 | smallDoses |
84 | 0.025977 | 0.205043 | 0.006732 | smallDoses |
85 | 0.152981 | 0.000000 | 0.847443 | smallDoses |
86 | 0.025188 | 0.178477 | 0.411431 | smallDoses |
87 | 0.057651 | 0.095729 | 0.813893 | smallDoses |
88 | 0.051045 | 0.119632 | 0.108045 | smallDoses |
89 | 0.192631 | 0.305083 | 0.516670 | smallDoses |
90 | 0.304033 | 0.408557 | 0.075279 | largeDoses |
91 | 0.108115 | 0.128827 | 0.254764 | smallDoses |
92 | 0.200859 | 0.188880 | 0.196029 | smallDoses |
93 | 0.041414 | 0.471152 | 0.193598 | smallDoses |
94 | 0.199292 | 0.098902 | 0.253058 | smallDoses |
95 | 0.122106 | 0.163037 | 0.372224 | smallDoses |
96 | 0.754287 | 0.476818 | 0.394621 | didntLike |
97 | 0.291159 | 0.509103 | 0.510795 | largeDoses |
98 | 0.527111 | 0.436655 | 0.429005 | largeDoses |
99 | 0.479408 | 0.376809 | 0.785718 | largeDoses |
100 rows × 4 columns
"""
函数功能:分类器
"""
def datingClass(train,test,k):
= train.shape[1] - 1 # 取出训练集(原始数据)标签外的所有列
m = test.shape[0] # 获取测试集的个数
result = [] # 存放结果
for i in range(m):
dist = list((((train.iloc[:, :n] - test.iloc[i, :n]) ** 2).sum(1))**5)
dist_l = pd.DataFrame({'dist': dist, 'labels': (train.iloc[:, n])})
dr = dist_l.sort_values(by = 'dist')[: k]
re = dr.loc[:, 'labels'].value_counts()
result.append(re.index[0]) # 加入result
result = pd.Series(result) # 吧列表 result 结果转换格式为 Series格式
test['predict'] = result # 追加到测试集 变成 DataFrame 格式增加新的一列
acc = (test.iloc[:,-1]==test.iloc[:,-2]).mean() # 确认准确率
print(f'模型预测准确率为{acc}')
return test
datingClass(train,test,5)
Model accuracy: 0.95
f:\python3\lib\site-packages\ipykernel_launcher.py:12: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
if sys.path[0] == '':
0 | 1 | 2 | 3 | predict | |
---|---|---|---|---|---|
0 | 0.513766 | 0.170320 | 0.262181 | didntLike | didntLike |
1 | 0.089599 | 0.154426 | 0.785277 | smallDoses | smallDoses |
2 | 0.611167 | 0.172689 | 0.915245 | didntLike | didntLike |
3 | 0.012578 | 0.000000 | 0.195477 | smallDoses | smallDoses |
4 | 0.110241 | 0.187926 | 0.287082 | smallDoses | smallDoses |
5 | 0.812113 | 0.705201 | 0.681085 | didntLike | didntLike |
6 | 0.729712 | 0.490545 | 0.960202 | didntLike | didntLike |
7 | 0.130301 | 0.133239 | 0.926158 | smallDoses | smallDoses |
8 | 0.557755 | 0.722409 | 0.780811 | largeDoses | largeDoses |
9 | 0.437051 | 0.247835 | 0.131156 | largeDoses | didntLike |
10 | 0.722174 | 0.184918 | 0.074908 | didntLike | didntLike |
11 | 0.719578 | 0.167690 | 0.016377 | didntLike | didntLike |
12 | 0.690193 | 0.526749 | 0.251657 | didntLike | didntLike |
13 | 0.403745 | 0.182242 | 0.386039 | didntLike | didntLike |
14 | 0.401751 | 0.528543 | 0.222839 | largeDoses | largeDoses |
15 | 0.425931 | 0.421948 | 0.590885 | largeDoses | largeDoses |
16 | 0.294479 | 0.534140 | 0.871767 | largeDoses | largeDoses |
17 | 0.506678 | 0.550039 | 0.248375 | largeDoses | largeDoses |
18 | 0.139811 | 0.372772 | 0.086617 | largeDoses | smallDoses |
19 | 0.386555 | 0.485440 | 0.807905 | largeDoses | largeDoses |
20 | 0.748370 | 0.508872 | 0.408589 | didntLike | didntLike |
21 | 0.342511 | 0.461926 | 0.897321 | largeDoses | largeDoses |
22 | 0.380770 | 0.515810 | 0.774052 | largeDoses | largeDoses |
23 | 0.146900 | 0.134351 | 0.129138 | smallDoses | smallDoses |
24 | 0.332683 | 0.469709 | 0.818801 | largeDoses | largeDoses |
25 | 0.117329 | 0.067943 | 0.399234 | smallDoses | smallDoses |
26 | 0.266585 | 0.531719 | 0.476847 | largeDoses | largeDoses |
27 | 0.498691 | 0.640661 | 0.389745 | largeDoses | largeDoses |
28 | 0.067687 | 0.057949 | 0.493195 | smallDoses | smallDoses |
29 | 0.116562 | 0.074976 | 0.765075 | smallDoses | smallDoses |
... | ... | ... | ... | ... | ... |
70 | 0.588465 | 0.580790 | 0.819148 | largeDoses | largeDoses |
71 | 0.705258 | 0.437379 | 0.515681 | didntLike | didntLike |
72 | 0.101772 | 0.462088 | 0.808077 | smallDoses | smallDoses |
73 | 0.664085 | 0.173051 | 0.169156 | didntLike | didntLike |
74 | 0.200914 | 0.250428 | 0.739211 | smallDoses | smallDoses |
75 | 0.250293 | 0.703453 | 0.886825 | largeDoses | largeDoses |
76 | 0.818161 | 0.690544 | 0.714136 | didntLike | didntLike |
77 | 0.374076 | 0.650571 | 0.214290 | largeDoses | largeDoses |
78 | 0.155062 | 0.150176 | 0.249725 | smallDoses | smallDoses |
79 | 0.102188 | 0.000000 | 0.070700 | smallDoses | smallDoses |
80 | 0.208068 | 0.021738 | 0.609152 | smallDoses | smallDoses |
81 | 0.100720 | 0.024394 | 0.008994 | smallDoses | smallDoses |
82 | 0.025035 | 0.184718 | 0.363083 | smallDoses | smallDoses |
83 | 0.104007 | 0.321426 | 0.331622 | smallDoses | smallDoses |
84 | 0.025977 | 0.205043 | 0.006732 | smallDoses | smallDoses |
85 | 0.152981 | 0.000000 | 0.847443 | smallDoses | smallDoses |
86 | 0.025188 | 0.178477 | 0.411431 | smallDoses | smallDoses |
87 | 0.057651 | 0.095729 | 0.813893 | smallDoses | smallDoses |
88 | 0.051045 | 0.119632 | 0.108045 | smallDoses | smallDoses |
89 | 0.192631 | 0.305083 | 0.516670 | smallDoses | smallDoses |
90 | 0.304033 | 0.408557 | 0.075279 | largeDoses | largeDoses |
91 | 0.108115 | 0.128827 | 0.254764 | smallDoses | smallDoses |
92 | 0.200859 | 0.188880 | 0.196029 | smallDoses | smallDoses |
93 | 0.041414 | 0.471152 | 0.193598 | smallDoses | smallDoses |
94 | 0.199292 | 0.098902 | 0.253058 | smallDoses | smallDoses |
95 | 0.122106 | 0.163037 | 0.372224 | smallDoses | smallDoses |
96 | 0.754287 | 0.476818 | 0.394621 | didntLike | didntLike |
97 | 0.291159 | 0.509103 | 0.510795 | largeDoses | largeDoses |
98 | 0.527111 | 0.436655 | 0.429005 | largeDoses | largeDoses |
99 | 0.479408 | 0.376809 | 0.785718 | largeDoses | largeDoses |
100 rows × 5 columns
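To classify one brand-new sample with the same pipeline, the per-row body of `datingClass` can be factored into a helper. This is a sketch of my own: the name `classify_point` and the toy training frame below are illustrative, not from the original data set, but the distance/sort/vote steps match the function above.

```python
import pandas as pd

def classify_point(inX, train, k):
    n = train.shape[1] - 1  # feature columns; the last column holds the label
    # Euclidean distance from inX to every training row
    dist = (((train.iloc[:, :n] - inX) ** 2).sum(1)) ** 0.5
    dist_l = pd.DataFrame({'dist': list(dist), 'labels': train.iloc[:, n]})
    dr = dist_l.sort_values(by='dist')[:k]           # k nearest rows
    return dr['labels'].value_counts().index[0]      # majority vote

# toy training frame in the same layout as datingT: three normalized features + a label
toy = pd.DataFrame({0: [0.1, 0.2, 0.8, 0.9],
                    1: [0.1, 0.2, 0.8, 0.9],
                    2: [0.5, 0.5, 0.5, 0.5],
                    3: ['smallDoses', 'smallDoses', 'didntLike', 'didntLike']})
print(classify_point([0.15, 0.15, 0.5], toy, 3))  # → smallDoses
```

Remember that a real new sample must first be scaled with the training set's min/max values, exactly as `minmax` did above, before being passed in.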