Bayesian Classification: Predicting San Francisco Crime Categories

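This is an example project from the Naive Bayes classifier lesson of the July Algorithm (七月算法) course. It is also a Kaggle competition. The goal is to train a model that predicts the crime category.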

Environment: Windows 7 64-bit, Python 3.5

1. Loading the data

The data set holds 12 years of San Francisco crime records. It is a CSV file, so it can be loaded with pandas. An excerpt:
Dates,Category,Descript,DayOfWeek,PdDistrict,Resolution,Address,X,Y
2015-05-13 23:53:00,WARRANTS,WARRANT ARREST,Wednesday,NORTHERN,"ARREST, BOOKED",OAK ST / LAGUNA ST,-122.425891675136,37.7745985956747
2015-05-13 23:53:00,OTHER OFFENSES,TRAFFIC VIOLATION ARREST,Wednesday,NORTHERN,"ARREST, BOOKED",OAK ST / LAGUNA ST,-122.425891675136,37.7745985956747
2015-05-13 23:33:00,OTHER OFFENSES,TRAFFIC VIOLATION ARREST,Wednesday,NORTHERN,"ARREST, BOOKED",VANNESS AV / GREENWICH ST,-122.42436302145,37.8004143219856
2015-05-13 23:30:00,LARCENY/THEFT,GRAND THEFT FROM LOCKED AUTO,Wednesday,NORTHERN,NONE,1500 Block of LOMBARD ST,-122.42699532676599,37.80087263276921
2015-05-13 23:30:00,LARCENY/THEFT,GRAND THEFT FROM LOCKED AUTO,Wednesday,PARK,NONE,100 Block of BRODERICK ST,-122.438737622757,37.771541172057795
2015-05-13 23:30:00,LARCENY/THEFT,GRAND THEFT FROM UNLOCKED AUTO,Wednesday,INGLESIDE,NONE,0 Block of TEDDY AV,-122.40325236121201,37.713430704116

The excerpt shows the following features:

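Dates: date and time of the crime

Category: crime category

Descript: description of the crime

DayOfWeek: day of the week

PdDistrict: police district

Resolution: how the case was resolved

Address: street block where the crime occurred

X and Y: GPS coordinates (X is longitude, Y is latitude)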
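As a quick check of the field layout, here is a minimal sketch that parses two rows copied from the excerpt out of an in-memory string instead of the real file. The variable names `sample` and `df` are only for illustration. It shows that the quoted Resolution value containing a comma is read as one field, and that `parse_dates` turns Dates into a datetime column.

```python
import io
import pandas as pd

# Two rows copied from the excerpt above (straight double quotes around
# the Resolution value, which itself contains a comma).
sample = io.StringIO(
    'Dates,Category,Descript,DayOfWeek,PdDistrict,Resolution,Address,X,Y\n'
    '2015-05-13 23:53:00,WARRANTS,WARRANT ARREST,Wednesday,NORTHERN,'
    '"ARREST, BOOKED",OAK ST / LAGUNA ST,-122.425891675136,37.7745985956747\n'
    '2015-05-13 23:33:00,OTHER OFFENSES,TRAFFIC VIOLATION ARREST,Wednesday,NORTHERN,'
    '"ARREST, BOOKED",VANNESS AV / GREENWICH ST,-122.42436302145,37.8004143219856\n'
)

df = pd.read_csv(sample, parse_dates=['Dates'])

print(df.shape)                 # 2 rows, 9 columns
print(df.Resolution.tolist())   # each value is the single string 'ARREST, BOOKED'
print(df.Dates.dtype)           # a datetime64 dtype, not object
```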

import pandas as pd
import numpy as np

train = pd.read_csv("C:\\data\\SanFrancisco\\train.csv",parse_dates=['Dates'])
test = pd.read_csv("C:\\data\\SanFrancisco\\test.csv",parse_dates=['Dates'])
   
   
train[0:6]
   
   
                Dates        Category                        Descript  \
0 2015-05-13 23:53:00        WARRANTS                  WARRANT ARREST   
1 2015-05-13 23:53:00  OTHER OFFENSES        TRAFFIC VIOLATION ARREST   
2 2015-05-13 23:33:00  OTHER OFFENSES        TRAFFIC VIOLATION ARREST   
3 2015-05-13 23:30:00   LARCENY/THEFT    GRAND THEFT FROM LOCKED AUTO   
4 2015-05-13 23:30:00   LARCENY/THEFT    GRAND THEFT FROM LOCKED AUTO   
5 2015-05-13 23:30:00   LARCENY/THEFT  GRAND THEFT FROM UNLOCKED AUTO   

   DayOfWeek PdDistrict      Resolution                    Address  \
0  Wednesday   NORTHERN  ARREST, BOOKED         OAK ST / LAGUNA ST   
1  Wednesday   NORTHERN  ARREST, BOOKED         OAK ST / LAGUNA ST   
2  Wednesday   NORTHERN  ARREST, BOOKED  VANNESS AV / GREENWICH ST   
3  Wednesday   NORTHERN            NONE   1500 Block of LOMBARD ST   
4  Wednesday       PARK            NONE  100 Block of BRODERICK ST   
5  Wednesday  INGLESIDE            NONE        0 Block of TEDDY AV   

            X          Y  
0 -122.425892  37.774599  
1 -122.425892  37.774599  
2 -122.424363  37.800414  
3 -122.426995  37.800873  
4 -122.438738  37.771541  
5 -122.403252  37.713431  

   
   

2. Feature preprocessing

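Most of the columns above are categorical or free text, so they need feature processing first. Since we are predicting the crime category, we want to quantify the features that relate to the crime as far as possible.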

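Dates: in the first few records, almost every crime happens after 23:00, which matches intuition.

Category: this is the target, and it needs to be encoded as a number.

Descript: this is written down after the crime has happened, so it is of no use as a predictor.

DayOfWeek: closely tied to Dates. On weekends and holidays more people are out, which also makes theft more likely.

PdDistrict and Resolution: neither has much to do with the motive for a crime. Resolution is only known after the fact. PdDistrict is nevertheless used in the code below as a coarse location feature.

Address: some blocks have worse public safety than others and a correspondingly higher crime rate, so location carries information.

Next, we preprocess the date (hour of day), crime category, day of week, and police district.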
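The remark about crime time is based on only the first few rows, and the rows appear to be sorted by time, so those rows alone say little about the overall distribution. A minimal sketch of how the hour could be extracted and counted instead: the first two timestamps come from the excerpt, the third is a made-up value added purely so the toy data has more than one hour.

```python
import pandas as pd

# First two timestamps are from the excerpt; the third is hypothetical.
dates = pd.Series(pd.to_datetime([
    '2015-05-13 23:53:00',
    '2015-05-13 23:33:00',
    '2015-05-13 08:15:00',
]))

# .dt.hour gives the hour of day (0-23) for each record
hours = dates.dt.hour

# Number of records per hour, ordered by hour
counts = hours.value_counts().sort_index()
print(counts)
```

Running the same two lines on `train.Dates` would give the per-hour counts for the full training set.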

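pandas' get_dummies() turns a categorical column directly into 0/1 indicator (one-hot) columns.

scikit-learn's LabelEncoder (in sklearn.preprocessing, not pandas) assigns an integer code to each category.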
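A minimal sketch of both tools on toy data. The category and district values are taken from the excerpt, but the frames `toy_train` and `toy_test` and their split are hypothetical. The last step is an addition not in the original workflow: `get_dummies` only creates columns for values it actually sees, so a test frame can end up with different columns from the training frame, and `reindex` is one way to line them up.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy frames: values from the excerpt, the train/test split is hypothetical.
toy_train = pd.DataFrame({
    'Category': ['WARRANTS', 'OTHER OFFENSES', 'LARCENY/THEFT'],
    'PdDistrict': ['NORTHERN', 'NORTHERN', 'PARK'],
})
toy_test = pd.DataFrame({'PdDistrict': ['PARK', 'INGLESIDE']})

# One-hot encoding: one indicator column per distinct value, in sorted order.
train_district = pd.get_dummies(toy_train.PdDistrict).astype(int)
print(list(train_district.columns))

# Label encoding: classes are sorted, then mapped to 0..n_classes-1.
le = LabelEncoder()
codes = le.fit_transform(toy_train.Category)
print(list(le.classes_))
print(codes.tolist())
# inverse_transform maps the integer codes back to the original labels.
print(list(le.inverse_transform(codes)))

# Align the test columns to the training columns: columns missing from the
# test frame are filled with 0, columns unseen in training are dropped.
test_district = (
    pd.get_dummies(toy_test.PdDistrict)
    .reindex(columns=train_district.columns, fill_value=0)
    .astype(int)
)
print(test_district)
```

On the full data both files most likely contain every hour, weekday and district, so the alignment step may change nothing, but it guards against a silent column mismatch.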

import pandas as pd
import numpy as np
# sklearn.cross_validation was removed in scikit-learn 0.20;
# train_test_split now lives in sklearn.model_selection
from sklearn.model_selection import train_test_split
from sklearn import preprocessing

# pd.set_option('display.notebook_repr_html',False)
# pd.set_option('display.max_columns',None)
# pd.set_option('display.max_rows',150)   
# pd.set_option('display.max_seq_items',None)

# Encode each crime category as an integer with LabelEncoder
leCrime = preprocessing.LabelEncoder()
crime = leCrime.fit_transform(train.Category)

# One-hot encode day of week, police district, and hour of day
days = pd.get_dummies(train.DayOfWeek)
district = pd.get_dummies(train.PdDistrict)
hour = train.Dates.dt.hour
hour = pd.get_dummies(hour) 

# Combine the features into one frame and attach the encoded target
trainData = pd.concat([hour, days, district], axis=1)
trainData['crime']=crime

# Apply the same processing to the test data
days = pd.get_dummies(test.DayOfWeek)
district = pd.get_dummies(test.PdDistrict)

hour = test.Dates.dt.hour
hour = pd.get_dummies(hour) 

testData = pd.concat([hour, days, district], axis=1)
trainData
   
   
        0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  \
0       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
1       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
2       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
3       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
4       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
5       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
6       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
7       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
8       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
9       0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
10      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
11      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
12      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
13      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
14      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
15      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
16      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
17      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
18      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
19      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
20      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
21      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
22      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
23      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
24      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
25      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
26      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
27      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
28      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
29      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
30      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
31      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
32      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
33      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
34      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
35      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
36      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
37      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
38      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
39      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
40      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
41      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
42      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
43      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
44      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
45      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
46      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
47      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
48      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
49      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
50      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
51      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
52      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
53      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
54      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
55      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
56      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
57      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
58      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
59      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
60      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
61      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
62      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
63      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
64      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
65      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
66      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
67      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
68      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
69      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
70      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
71      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
72      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
73      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
74      0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   1   
...    .. .. .. .. .. .. .. .. .. ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..   
877974  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877975  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877976  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877977  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877978  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877979  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877980  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877981  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877982  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877983  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877984  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877985  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877986  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877987  0  0  0  0  0  0  0  0  1  0   0   0   0   0   0   0   0   0   0   0   
877988  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877989  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877990  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877991  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877992  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877993  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877994  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877995  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877996  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877997  0  0  0  0  0  0  0  1  0  0   0   0   0   0   0   0   0   0   0   0   
877998  0  0  0  0  0  0  1  0  0  0   0   0   0   0   0   0   0   0   0   0   
877999  0  0  0  0  0  0  1  0  0  0   0   0   0   0   0   0   0   0   0   0   
878000  0  0  0  0  0  0  1  0  0  0   0   0   0   0   0   0   0   0   0   0   
878001  0  0  0  0  0  0  1  0  0  0   0   0   0   0   0   0   0   0   0   0   
878002  0  0  0  0  0  0  1  0  0  0   0   0   0   0   0   0   0   0   0   0   
878003  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878004  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878005  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878006  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878007  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878008  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878009  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878010  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878011  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878012  0  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878013  0  0  0  0  1  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878014  0  0  0  1  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878015  0  0  0  1  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878016  0  0  0  1  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878017  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878018  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878019  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878020  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878021  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878022  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878023  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878024  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878025  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878026  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878027  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878028  0  0  1  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878029  0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878030  0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878031  0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878032  0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878033  0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878034  0  1  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878035  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878036  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878037  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878038  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878039  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878040  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878041  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878042  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878043  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878044  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878045  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878046  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878047  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   
878048  1  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   

        20  21  22  23  Friday  Monday  Saturday  Sunday  Thursday  Tuesday  \
0        0   0   0   1       0       0         0       0         0        0   
1        0   0   0   1       0       0         0       0         0        0   
2        0   0   0   1       0       0         0       0         0        0   
3        0   0   0   1       0       0         0       0         0        0   
4        0   0   0   1       0       0         0       0         0        0   
5        0   0   0   1       0       0         0       0         0        0   
6        0   0   0   1       0       0         0       0         0        0   
7        0   0   0   1       0       0         0       0         0        0   
8        0   0   0   1       0       0         0       0         0        0   
9        0   0   0   1       0       0         0       0         0        0   
10       0   0   1   0       0       0         0       0         0        0   
11       0   0   1   0       0       0         0       0         0        0   
12       0   0   1   0       0       0         0       0         0        0   
13       0   0   1   0       0       0         0       0         0        0   
14       0   0   1   0       0       0         0       0         0        0   
15       0   0   1   0       0       0         0       0         0        0   
16       0   0   1   0       0       0         0       0         0        0   
17       0   1   0   0       0       0         0       0         0        0   
18       0   1   0   0       0       0         0       0         0        0   
19       0   1   0   0       0       0         0       0         0        0   
20       0   1   0   0       0       0         0       0         0        0   
21       0   1   0   0       0       0         0       0         0        0   
22       0   1   0   0       0       0         0       0         0        0   
23       0   1   0   0       0       0         0       0         0        0   
24       0   1   0   0       0       0         0       0         0        0   
25       0   1   0   0       0       0         0       0         0        0   
26       0   1   0   0       0       0         0       0         0        0   
27       0   1   0   0       0       0         0       0         0        0   
28       0   1   0   0       0       0         0       0         0        0   
29       1   0   0   0       0       0         0       0         0        0   
30       1   0   0   0       0       0         0       0         0        0   
31       1   0   0   0       0       0         0       0         0        0   
32       1   0   0   0       0       0         0       0         0        0   
33       1   0   0   0       0       0         0       0         0        0   
34       1   0   0   0       0       0         0       0         0        0   
35       1   0   0   0       0       0         0       0         0        0   
36       1   0   0   0       0       0         0       0         0        0   
37       1   0   0   0       0       0         0       0         0        0   
38       1   0   0   0       0       0         0       0         0        0   
39       1   0   0   0       0       0         0       0         0        0   
40       1   0   0   0       0       0         0       0         0        0   
41       1   0   0   0       0       0         0       0         0        0   
42       1   0   0   0       0       0         0       0         0        0   
43       1   0   0   0       0       0         0       0         0        0   
44       1   0   0   0       0       0         0       0         0        0   
45       1   0   0   0       0       0         0       0         0        0   
46       1   0   0   0       0       0         0       0         0        0   
47       1   0   0   0       0       0         0       0         0        0   
48       0   0   0   0       0       0         0       0         0        0   
49       0   0   0   0       0       0         0       0         0        0   
50       0   0   0   0       0       0         0       0         0        0   
51       0   0   0   0       0       0         0       0         0        0   
52       0   0   0   0       0       0         0       0         0        0   
53       0   0   0   0       0       0         0       0         0        0   
54       0   0   0   0       0       0         0       0         0        0   
55       0   0   0   0       0       0         0       0         0        0   
56       0   0   0   0       0       0         0       0         0        0   
57       0   0   0   0       0       0         0       0         0        0   
58       0   0   0   0       0       0         0       0         0        0   
59       0   0   0   0       0       0         0       0         0        0   
60       0   0   0   0       0       0         0       0         0        0   
61       0   0   0   0       0       0         0       0         0        0   
62       0   0   0   0       0       0         0       0         0        0   
63       0   0   0   0       0       0         0       0         0        0   
64       0   0   0   0       0       0         0       0         0        0   
65       0   0   0   0       0       0         0       0         0        0   
66       0   0   0   0       0       0         0       0         0        0   
67       0   0   0   0       0       0         0       0         0        0   
68       0   0   0   0       0       0         0       0         0        0   
69       0   0   0   0       0       0         0       0         0        0   
70       0   0   0   0       0       0         0       0         0        0   
71       0   0   0   0       0       0         0       0         0        0   
72       0   0   0   0       0       0         0       0         0        0   
73       0   0   0   0       0       0         0       0         0        0   
74       0   0   0   0       0       0         0       0         0        0   
...     ..  ..  ..  ..     ...     ...       ...     ...       ...      ...   
877974   0   0   0   0       0       1         0       0         0        0   
877975   0   0   0   0       0       1         0       0         0        0   
877976   0   0   0   0       0       1         0       0         0        0   
877977   0   0   0   0       0       1         0       0         0        0   
877978   0   0   0   0       0       1         0       0         0        0   
877979   0   0   0   0       0       1         0       0         0        0   
877980   0   0   0   0       0       1         0       0         0        0   
877981   0   0   0   0       0       1         0       0         0        0   
877982   0   0   0   0       0       1         0       0         0        0   
877983   0   0   0   0       0       1         0       0         0        0   
877984   0   0   0   0       0       1         0       0         0        0   
877985   0   0   0   0       0       1         0       0         0        0   
877986   0   0   0   0       0       1         0       0         0        0   
877987   0   0   0   0       0       1         0       0         0        0   
877988   0   0   0   0       0       1         0       0         0        0   
877989   0   0   0   0       0       1         0       0         0        0   
877990   0   0   0   0       0       1         0       0         0        0   
877991   0   0   0   0       0       1         0       0         0        0   
877992   0   0   0   0       0       1         0       0         0        0   
877993   0   0   0   0       0       1         0       0         0        0   
877994   0   0   0   0       0       1         0       0         0        0   
877995   0   0   0   0       0       1         0       0         0        0   
877996   0   0   0   0       0       1         0       0         0        0   
877997   0   0   0   0       0       1         0       0         0        0   
877998   0   0   0   0       0       1         0       0         0        0   
877999   0   0   0   0       0       1         0       0         0        0   
878000   0   0   0   0       0       1         0       0         0        0   
878001   0   0   0   0       0       1         0       0         0        0   
878002   0   0   0   0       0       1         0       0         0        0   
878003   0   0   0   0       0       1         0       0         0        0   
878004   0   0   0   0       0       1         0       0         0        0   
878005   0   0   0   0       0       1         0       0         0        0   
878006   0   0   0   0       0       1         0       0         0        0   
878007   0   0   0   0       0       1         0       0         0        0   
878008   0   0   0   0       0       1         0       0         0        0   
878009   0   0   0   0       0       1         0       0         0        0   
878010   0   0   0   0       0       1         0       0         0        0   
878011   0   0   0   0       0       1         0       0         0        0   
878012   0   0   0   0       0       1         0       0         0        0   
878013   0   0   0   0       0       1         0       0         0        0   
878014   0   0   0   0       0       1         0       0         0        0   
878015   0   0   0   0       0       1         0       0         0        0   
878016   0   0   0   0       0       1         0       0         0        0   
878017   0   0   0   0       0       1         0       0         0        0   
878018   0   0   0   0       0       1         0       0         0        0   
878019   0   0   0   0       0       1         0       0         0        0   
878020   0   0   0   0       0       1         0       0         0        0   
878021   0   0   0   0       0       1         0       0         0        0   
878022   0   0   0   0       0       1         0       0         0        0   
878023   0   0   0   0       0       1         0       0         0        0   
878024   0   0   0   0       0       1         0       0         0        0   
878025   0   0   0   0       0       1         0       0         0        0   
878026   0   0   0   0       0       1         0       0         0        0   
878027   0   0   0   0       0       1         0       0         0        0   
878028   0   0   0   0       0       1         0       0         0        0   
878029   0   0   0   0       0       1         0       0         0        0   
878030   0   0   0   0       0       1         0       0         0        0   
878031   0   0   0   0       0       1         0       0         0        0   
878032   0   0   0   0       0       1         0       0         0        0   
878033   0   0   0   0       0       1         0       0         0        0   
878034   0   0   0   0       0       1         0       0         0        0   
878035   0   0   0   0       0       1         0       0         0        0   
878036   0   0   0   0       0       1         0       0         0        0   
878037   0   0   0   0       0       1         0       0         0        0   
878038   0   0   0   0       0       1         0       0         0        0   
878039   0   0   0   0       0       1         0       0         0        0   
878040   0   0   0   0       0       1         0       0         0        0   
878041   0   0   0   0       0       1         0       0         0        0   
878042   0   0   0   0       0       1         0       0         0        0   
878043   0   0   0   0       0       1         0       0         0        0   
878044   0   0   0   0       0       1         0       0         0        0   
878045   0   0   0   0       0       1         0       0         0        0   
878046   0   0   0   0       0       1         0       0         0        0   
878047   0   0   0   0       0       1         0       0         0        0   
878048   0   0   0   0       0       1         0       0         0        0   

        Wednesday  BAYVIEW  CENTRAL  INGLESIDE  MISSION  NORTHERN  PARK  \
0               1        0        0          0        0         1     0   
1               1        0        0          0        0         1     0   
2               1        0        0          0        0         1     0   
3               1        0        0          0        0         1     0   
4               1        0        0          0        0         0     1   
5               1        0        0          1        0         0     0   
6               1        0        0          1        0         0     0   
7               1        1        0          0        0         0     0   
8               1        0        0          0        0         0     0   
9               1        0        1          0        0         0     0   
10              1        0        1          0        0         0     0   
11              1        0        0          0        0         0     0   
12              1        0        0          0        0         0     0   
13              1        0        0          0        0         1     0   
14              1        1        0          0        0         0     0   
15              1        1        0          0        0         0     0   
16              1        0        0          0        0         0     0   
17              1        0        0          1        0         0     0   
18              1        1        0          0        0         0     0   
19              1        0        0          0        0         0     0   
20              1        0        0          1        0         0     0   
21              1        0        0          1        0         0     0   
22              1        0        0          0        0         0     0   
23              1        0        0          0        0         0     0   
24              1        0        0          0        0         1     0   
25              1        0        0          0        0         0     0   
26              1        0        0          0        0         1     0   
27              1        0        0          1        0         0     0   
28              1        0        0          0        0         0     0   
29              1        0        0          0        0         0     0   
30              1        0        0          0        0         1     0   
31              1        0        0          0        1         0     0   
32              1        0        0          0        0         1     0   
33              1        0        0          0        0         1     0   
34              1        0        0          0        0         1     0   
35              1        0        0          0        0         0     0   
36              1        0        0          0        0         1     0   
37              1        0        0          0        0         1     0   
38              1        0        0          0        0         0     0   
39              1        0        0          1        0         0     0   
40              1        0        0          0        0         0     0   
41              1        0        0          0        0         0     0   
42              1        0        0          0        0         0     0   
43              1        1        0          0        0         0     0   
44              1        1        0          0        0         0     0   
45              1        0        1          0        0         0     0   
46              1        0        0          1        0         0     0   
47              1        0        0          0        0         0     0   
48              1        0        1          0        0         0     0   
49              1        0        0          0        0         0     1   
50              1        1        0          0        0         0     0   
51              1        1        0          0        0         0     0   
52              1        0        0          0        0         0     0   
53              1        0        0          0        0         0     0   
54              1        0        0          0        0         0     0   
55              1        0        0          0        0         0     0   
56              1        0        0          0        0         1     0   
57              1        0        0          0        0         0     0   
58              1        0        0          0        0         1     0   
59              1        0        1          0        0         0     0   
60              1        0        1          0        0         0     0   
61              1        0        1          0        0         0     0   
62              1        0        1          0        0         0     0   
63              1        0        0          0        0         0     0   
64              1        0        0          0        0         0     0   
65              1        0        0          0        0         0     0   
66              1        0        0          0        0         0     0   
67              1        0        0          0        0         0     0   
68              1        0        0          0        0         0     0   
69              1        0        0          0        0         0     0   
70              1        0        0          0        0         0     0   
71              1        0        0          0        0         1     0   
72              1        1        0          0        0         0     0   
73              1        0        0          0        1         0     0   
74              1        0        1          0        0         0     0   
...           ...      ...      ...        ...      ...       ...   ...   
877974          0        0        0          0        0         0     1   
877975          0        0        0          0        0         0     1   
877976          0        0        1          0        0         0     0   
877977          0        0        0          0        0         0     0   
877978          0        0        0          0        0         0     0   
877979          0        0        0          0        0         0     0   
877980          0        0        0          0        0         0     0   
877981          0        0        0          0        0         1     0   
877982          0        0        0          0        0         0     0   
877983          0        0        0          0        1         0     0   
877984          0        0        1          0        0         0     0   
877985          0        0        0          0        0         0     0   
877986          0        1        0          0        0         0     0   
877987          0        0        0          1        0         0     0   
877988          0        0        0          0        0         0     0   
877989          0        1        0          0        0         0     0   
877990          0        0        0          0        0         1     0   
877991          0        0        0          0        0         0     0   
877992          0        0        0          0        0         0     1   
877993          0        0        0          0        0         0     0   
877994          0        0        0          1        0         0     0   
877995          0        0        0          0        1         0     0   
877996          0        0        0          0        1         0     0   
877997          0        1        0          0        0         0     0   
877998          0        0        0          0        0         0     1   
877999          0        1        0          0        0         0     0   
878000          0        1        0          0        0         0     0   
878001          0        0        0          0        0         0     0   
878002          0        0        0          0        0         0     0   
878003          0        0        1          0        0         0     0   
878004          0        0        0          0        0         1     0   
878005          0        0        0          0        0         0     0   
878006          0        0        0          0        0         0     0   
878007          0        0        0          0        0         0     0   
878008          0        0        0          1        0         0     0   
878009          0        0        0          1        0         0     0   
878010          0        0        0          0        0         0     0   
878011          0        0        0          0        0         1     0   
878012          0        0        0          0        0         0     0   
878013          0        0        0          0        0         0     0   
878014          0        0        0          0        0         1     0   
878015          0        0        0          0        0         1     0   
878016          0        1        0          0        0         0     0   
878017          0        0        1          0        0         0     0   
878018          0        0        1          0        0         0     0   
878019          0        0        0          0        0         0     0   
878020          0        0        0          0        0         1     0   
878021          0        0        0          0        0         1     0   
878022          0        0        0          0        1         0     0   
878023          0        0        0          0        0         0     0   
878024          0        0        0          0        0         0     1   
878025          0        1        0          0        0         0     0   
878026          0        1        0          0        0         0     0   
878027          0        0        0          0        0         0     0   
878028          0        0        0          0        0         0     0   
878029          0        0        0          0        0         0     0   
878030          0        0        0          0        0         0     0   
878031          0        1        0          0        0         0     0   
878032          0        0        0          0        0         1     0   
878033          0        0        0          0        0         0     0   
878034          0        0        0          0        0         0     0   
878035          0        0        0          0        0         1     0   
878036          0        0        0          0        0         1     0   
878037          0        0        0          0        0         1     0   
878038          0        0        0          0        0         0     0   
878039          0        0        0          0        0         1     0   
878040          0        0        0          0        1         0     0   
878041          0        0        0          0        0         0     0   
878042          0        1        0          0        0         0     0   
878043          0        1        0          0        0         0     0   
878044          0        0        0          0        0         0     0   
878045          0        0        0          1        0         0     0   
878046          0        0        0          0        0         0     0   
878047          0        0        0          0        0         0     0   
878048          0        1        0          0        0         0     0   

        RICHMOND  SOUTHERN  TARAVAL  TENDERLOIN  crime  
0              0         0        0           0     37  
1              0         0        0           0     21  
2              0         0        0           0     21  
3              0         0        0           0     16  
4              0         0        0           0     16  
5              0         0        0           0     16  
6              0         0        0           0     36  
7              0         0        0           0     36  
8              1         0        0           0     16  
9              0         0        0           0     16  
10             0         0        0           0     16  
11             0         0        1           0     21  
12             0         0        0           1     35  
13             0         0        0           0     16  
14             0         0        0           0     20  
15             0         0        0           0     20  
16             0         0        0           1     25  
17             0         0        0           0      1  
18             0         0        0           0     21  
19             0         0        0           1     20  
20             0         0        0           0     16  
21             0         0        0           0     25  
22             0         0        0           1     37  
23             0         0        0           1     20  
24             0         0        0           0     16  
25             0         0        0           1     20  
26             0         0        0           0     16  
27             0         0        0           0     16  
28             0         0        1           0     16  
29             0         0        1           0     21  
30             0         0        0           0     16  
31             0         0        0           0     20  
32             0         0        0           0     35  
33             0         0        0           0     16  
34             0         0        0           0     35  
35             0         1        0           0     16  
36             0         0        0           0     16  
37             0         0        0           0     16  
38             0         0        1           0     38  
39             0         0        0           0     35  
40             0         1        0           0     20  
41             0         1        0           0     16  
42             0         0        0           1     16  
43             0         0        0           0     21  
44             0         0        0           0     21  
45             0         0        0           0     21  
46             0         0        0           0     36  
47             0         0        1           0     16  
48             0         0        0           0     20  
49             0         0        0           0      4  
50             0         0        0           0     25  
51             0         0        0           0      1  
52             0         1        0           0     16  
53             0         1        0           0     16  
54             0         1        0           0     32  
55             0         1        0           0     16  
56             0         0        0           0     16  
57             0         1        0           0     16  
58             0         0        0           0     16  
59             0         0        0           0     36  
60             0         0        0           0     36  
61             0         0        0           0      8  
62             0         0        0           0     32  
63             0         1        0           0     20  
64             0         1        0           0     16  
65             0         0        1           0     16  
66             0         0        0           1     37  
67             0         0        0           1     37  
68             0         0        0           1     21  
69             0         1        0           0     16  
70             0         1        0           0     16  
71             0         0        0           0     16  
72             0         0        0           0     16  
73             0         0        0           0     36  
74             0         0        0           0     16  
...          ...       ...      ...         ...    ...  
877974         0         0        0           0     36  
877975         0         0        0           0     36  
877976         0         0        0           0     20  
877977         0         1        0           0     21  
877978         0         0        1           0     21  
877979         0         0        1           0     36  
877980         0         0        1           0     36  
877981         0         0        0           0     32  
877982         0         1        0           0     21  
877983         0         0        0           0     21  
877984         0         0        0           0     16  
877985         0         1        0           0     21  
877986         0         0        0           0     21  
877987         0         0        0           0      4  
877988         0         1        0           0     34  
877989         0         0        0           0     21  
877990         0         0        0           0     20  
877991         0         1        0           0     21  
877992         0         0        0           0     16  
877993         0         1        0           0     21  
877994         0         0        0           0     36  
877995         0         0        0           0     37  
877996         0         0        0           0     21  
877997         0         0        0           0     21  
877998         0         0        0           0     19  
877999         0         0        0           0     36  
878000         0         0        0           0     36  
878001         0         1        0           0     21  
878002         0         1        0           0     16  
878003         0         0        0           0      1  
878004         0         0        0           0      1  
878005         0         1        0           0     21  
878006         0         1        0           0     35  
878007         0         1        0           0     34  
878008         0         0        0           0     30  
878009         0         0        0           0     21  
878010         1         0        0           0      4  
878011         0         0        0           0     35  
878012         1         0        0           0     13  
878013         0         1        0           0      4  
878014         0         0        0           0     21  
878015         0         0        0           0     30  
878016         0         0        0           0     35  
878017         0         0        0           0     25  
878018         0         0        0           0     21  
878019         0         1        0           0     21  
878020         0         0        0           0     21  
878021         0         0        0           0     35  
878022         0         0        0           0     36  
878023         0         0        0           1     16  
878024         0         0        0           0     21  
878025         0         0        0           0     21  
878026         0         0        0           0     37  
878027         0         1        0           0     37  
878028         0         1        0           0      1  
878029         0         0        0           1     21  
878030         0         0        0           1     28  
878031         0         0        0           0      1  
878032         0         0        0           0     21  
878033         1         0        0           0     35  
878034         1         0        0           0     34  
878035         0         0        0           0      1  
878036         0         0        0           0     16  
878037         0         0        0           0     35  
878038         0         0        0           1     37  
878039         0         0        0           0     21  
878040         0         0        0           0      1  
878041         1         0        0           0     21  
878042         0         0        0           0      1  
878043         0         0        0           0     21  
878044         0         0        1           0     25  
878045         0         0        0           0     16  
878046         0         1        0           0     16  
878047         0         1        0           0     35  
878048         0         0        0           0     12  

[878049 rows x 42 columns]

   
   

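We can quickly pick out a handful of important features, build a baseline system, and then improve it step by step.
To keep things simple here, we use only the day of the week and the police district as classifier inputs. We use
scikit-learn's train_test_split function to obtain a training set and a validation set, fit both a Naive Bayes model and a logistic regression model, and compare their performance: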

# sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.metrics import log_loss
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression
import time

# Use only the day of the week and the police district as classifier inputs
features = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday', 'BAYVIEW', 'CENTRAL', 'INGLESIDE', 'MISSION',
 'NORTHERN', 'PARK', 'RICHMOND', 'SOUTHERN', 'TARAVAL', 'TENDERLOIN']

# Split into a training set (3/5) and a validation set (2/5)
training, validation = train_test_split(trainData, train_size=.60)

# Fit Naive Bayes and compute the log loss on the validation set
model = BernoulliNB()
nbStart = time.time()
model.fit(training[features], training['crime'])
nbCostTime = time.time() - nbStart
predicted = np.array(model.predict_proba(validation[features]))
print("Naive Bayes training time: %f s" % nbCostTime)
print("Naive Bayes log loss: %f" % log_loss(validation['crime'], predicted))

# Fit logistic regression and compute the log loss on the validation set
model = LogisticRegression(C=.01)
lrStart = time.time()
model.fit(training[features], training['crime'])
lrCostTime = time.time() - lrStart
predicted = np.array(model.predict_proba(validation[features]))
print("Logistic regression training time: %f s" % lrCostTime)
print("Logistic regression log loss: %f" % log_loss(validation['crime'], predicted))
   
   
Naive Bayes training time: 0.477027 s
Naive Bayes log loss: 2.614108
Logistic regression training time: 58.954372 s
Logistic regression log loss: 2.621150

   
   

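With the current features and parameter settings, Naive Bayes achieves a slightly lower log loss (2.614 vs. 2.621).
It also trains far faster than logistic regression: about 0.5 s versus about 59 s.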
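
To make the metric concrete, here is a small sketch of multi-class log loss computed by hand. The helper name `multiclass_log_loss` and the toy probabilities are illustrative assumptions, not part of the original experiment; scikit-learn's `log_loss` computes the same quantity with additional input checks.

```python
import math

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices.
    probs:  list of per-sample probability lists (each row sums to 1).
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) when a model is confidently wrong.
        p = min(max(row[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)

# Toy example (made-up numbers): 3 samples, 3 classes.
y_true = [0, 2, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
loss = multiclass_log_loss(y_true, probs)
print("log loss: %f" % loss)

# A uniform guess over K classes scores ln(K); here K = 39.
uniform = multiclass_log_loss([0], [[1 / 39] * 39])
print("uniform baseline: %f" % uniform)
```

The competition has 39 crime categories, so a uniform guess would score ln 39 ≈ 3.66. A log loss around 2.61 is therefore a real improvement over guessing, though still far from a confident classifier.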

# sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.metrics import log_loss
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression
import time

# Day of the week and police district, as before
features = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday', 'BAYVIEW', 'CENTRAL', 'INGLESIDE', 'MISSION',
 'NORTHERN', 'PARK', 'RICHMOND', 'SOUTHERN', 'TARAVAL', 'TENDERLOIN']

# Add the hour of day as a third group of features (columns named 0..23 in trainData).
# Do not re-assign `features` after this point, or the hour columns are silently dropped.
hourFea = [x for x in range(0,24)]
features = features + hourFea

# Split into a training set (3/5) and a validation set (2/5)
training, validation = train_test_split(trainData, train_size=.60)

# Fit Naive Bayes and compute the log loss on the validation set
model = BernoulliNB()
nbStart = time.time()
model.fit(training[features], training['crime'])
nbCostTime = time.time() - nbStart
predicted = np.array(model.predict_proba(validation[features]))
print("Naive Bayes training time: %f s" % nbCostTime)
print("Naive Bayes log loss: %f" % log_loss(validation['crime'], predicted))

# Fit logistic regression and compute the log loss on the validation set
model = LogisticRegression(C=.01)
lrStart = time.time()
model.fit(training[features], training['crime'])
lrCostTime = time.time() - lrStart
predicted = np.array(model.predict_proba(validation[features]))
print("Logistic regression training time: %f s" % lrCostTime)
print("Logistic regression log loss: %f" % log_loss(validation['crime'], predicted))
   
   
Naive Bayes training time: 0.478027 s
Naive Bayes log loss: 2.613777
Logistic regression training time: 58.734359 s
Logistic regression log loss: 2.621033

   
   

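With these three groups of categorical features, Naive Bayes still holds a small advantage over logistic regression (log loss 2.6138 vs. 2.6210),
and its training time remains very short, so the model is simple yet competitive. The reported losses are almost identical to those of the first run, which suggests the hour columns had little effect in the run shown here.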
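
The short training time follows from how Bernoulli Naive Bayes is estimated: fitting amounts to counting how often each binary feature is 1 within each class. The sketch below illustrates this with NumPy on made-up one-hot data. The function names `fit_bernoulli_nb` and `predict_proba` and the toy arrays are assumptions for illustration; this is not scikit-learn's actual implementation, although it uses the same Laplace-smoothing idea (`alpha=1.0`).

```python
import numpy as np

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and per-feature P(x_j = 1 | class) by counting."""
    classes = np.unique(y)
    # Per class: number of samples, and how often each feature equals 1.
    class_count = np.array([(y == c).sum() for c in classes], dtype=float)
    feature_count = np.array([X[y == c].sum(axis=0) for c in classes], dtype=float)
    log_prior = np.log(class_count / class_count.sum())
    # Laplace smoothing: add alpha to the "1" count and 2 * alpha to the total.
    theta = (feature_count + alpha) / (class_count[:, None] + 2 * alpha)
    return classes, log_prior, theta

def predict_proba(X, log_prior, theta):
    """Posterior class probabilities under the feature-independence assumption."""
    # log P(x | c) = sum_j [x_j * log(theta_cj) + (1 - x_j) * log(1 - theta_cj)]
    log_lik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
    joint = log_lik + log_prior
    joint -= joint.max(axis=1, keepdims=True)  # stabilise the exponentials
    p = np.exp(joint)
    return p / p.sum(axis=1, keepdims=True)

# Toy one-hot data (made up): columns are [Monday, Tuesday, NORTHERN, PARK].
X = np.array([[1, 0, 1, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 1, 0, 1],
              [1, 0, 0, 1]])
y = np.array([0, 0, 1, 1, 1])

classes, log_prior, theta = fit_bernoulli_nb(X, y)
proba = predict_proba(X, log_prior, theta)
print(proba.round(3))
```

Fitting is a single pass of sums over the data with no iterative optimisation, whereas logistic regression has to run a numerical solver over all classes, which is consistent with the large gap in training time seen above.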

References:
http://blog.csdn.net/han_xiaoyang/article/details/50629608
