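Our data does not come ready to map: some munging is required, and the user needs to ensure a 1:1 mapping between the geoJSON keys and the pandas DataFrame. Here is a brief example of the DataFrame munging needed for the previous example. Our county information is a CSV file listing FIPS code, county name, and economic information (column names omitted):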
    00000,US,United States,154505871,140674478,13831393,9,50502,100
    01000,AL,Alabama,2190519,1993977,196542,9,41427,100
    01001,AL,Autauga County,25930,23854,2076,8,48863,117.9
    01003,AL,Baldwin County,85407,78491,6916,8.1,50144,121
    01005,AL,Barbour County,9761,8651,1110,11.4,30117,72.7
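One detail of this file matters for the join: the FIPS codes in the CSV are zero-padded (`01001`), while the geoJSON ids are not (`1001`). The following is a minimal sketch of how pandas' default type inference affects this. It assumes a header row using the column names that appear in the merged output of this example (the source file's header is omitted here) and keeps only the first three columns of three sample rows.

```python
import io
import pandas as pd

# Three rows from the sample data, trimmed to the first three columns.
# The header line is an assumption; the original listing omits column names.
sample_csv = io.StringIO(
    "FIPS_Code,State,Area_name\n"
    "01001,AL,Autauga County\n"
    "01003,AL,Baldwin County\n"
    "01005,AL,Barbour County\n"
)

# With default settings pandas infers an integer column, so "01001" becomes 1001.
df = pd.read_csv(sample_csv)
fips_as_str = df["FIPS_Code"].astype(str).tolist()
print(fips_as_str)  # ['1001', '1003', '1005']

# Forcing dtype=str keeps the zero padding instead.
sample_csv.seek(0)
padded = pd.read_csv(sample_csv, dtype={"FIPS_Code": str})["FIPS_Code"].tolist()
print(padded)  # ['01001', '01003', '01005']
```

Whichever form is used, the point is that both sides of the join must end up with the same string representation of the key.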
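In the geoJSON, our county shapes use the FIPS code as their id (thanks to the related information forked from Trifacta). For brevity the actual shapes have been truncated here; the full dataset can be found in the example data: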
    {"type":"FeatureCollection","features":[
    {"type":"Feature","id":"1001","properties":{"name":"Autauga"}
    {"type":"Feature","id":"1003","properties":{"name":"Baldwin"}
    {"type":"Feature","id":"1005","properties":{"name":"Barbour"}
    {"type":"Feature","id":"1007","properties":{"name":"Bibb"}
    {"type":"Feature","id":"1009","properties":{"name":"Blount"}
    {"type":"Feature","id":"1011","properties":{"name":"Bullock"}
    {"type":"Feature","id":"1013","properties":{"name":"Butler"}
    {"type":"Feature","id":"1015","properties":{"name":"Calhoun"}
    {"type":"Feature","id":"1017","properties":{"name":"Chambers"}
    {"type":"Feature","id":"1019","properties":{"name":"Cherokee"}
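Since a 1:1 mapping depends on these ids, it can be worth confirming that no id appears twice before joining. The sketch below is an illustration rather than part of the original workflow: `duplicate_ids` is a hypothetical helper name, and the inline `sample` collection is a stand-in whose `geometry` is set to `None` because the real shapes are truncated in the listing.

```python
from collections import Counter


def duplicate_ids(feature_collection):
    """Return the feature ids (as strings) that occur more than once."""
    counts = Counter(str(f["id"]) for f in feature_collection["features"])
    return sorted(fid for fid, n in counts.items() if n > 1)


# Stand-in FeatureCollection; geometry is a placeholder, not real shape data.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "1001", "properties": {"name": "Autauga"}, "geometry": None},
        {"type": "Feature", "id": "1003", "properties": {"name": "Baldwin"}, "geometry": None},
        {"type": "Feature", "id": "1005", "properties": {"name": "Barbour"}, "geometry": None},
    ],
}

print(duplicate_ids(sample))  # []
```

An empty list means every shape has a distinct key; any id that is reported would need to be resolved before the data can map 1:1.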
We need to match up the FIPS codes and make sure the match is exact; otherwise Vega will not be able to zip the data together properly:
    import json
    import pandas as pd

    # county_geo and county_data are the paths to the geoJSON and CSV files
    # (defined elsewhere).
    with open(county_geo, 'r') as f:
        get_id = json.load(f)

    # Collect the id of every feature in the geoJSON.
    county_codes = [x['id'] for x in get_id['features']]
    county_df = pd.DataFrame({'FIPS_Code': county_codes}, dtype=str)

    # Read the county data and convert the FIPS codes to strings.
    df = pd.read_csv(county_data, na_values=[' '])
    df['FIPS_Code'] = df['FIPS_Code'].astype(str)

    # Keep only the rows whose FIPS code has a matching shape.
    merged = pd.merge(df, county_df, on='FIPS_Code', how='inner')
    # Forward-fill missing values; equivalent to fillna(method='pad'),
    # which newer pandas releases deprecate.
    merged = merged.ffill()

    >>> merged.head()
      FIPS_Code State       Area_name  Civilian_labor_force_2011  Employed_2011  \
    0      1001    AL  Autauga County                      25930          23854
    1      1003    AL  Baldwin County                      85407          78491
    2      1005    AL  Barbour County                       9761           8651
    3      1007    AL     Bibb County                       9216           8303
    4      1009    AL   Blount County                      26347          24156

       Unemployed_2011  Unemployment_rate_2011  Median_Household_Income_2011  \
    0             2076                     8.0                         48863
    1             6916                     8.1                         50144
    2             1110                    11.4                         30117
    3              913                     9.9                         37347
    4             2191                     8.3                         41940

       Med_HH_Income_Percent_of_StateTotal_2011
    0                                     117.9
    1                                     121.0
    2                                      72.7
    3                                      90.2
    4                                     101.2
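An inner merge silently drops any key that appears on only one side, so it can be useful to see which keys failed to match. The following is a minimal sketch using pandas' `indicator=True` option on an outer merge. `unmatched_keys` is a hypothetical helper, and the two toy frames merely stand in for the data and geoJSON key frames of this example.

```python
import pandas as pd


def unmatched_keys(data_df, geo_df, key="FIPS_Code"):
    """Return (only_in_data, only_in_geo): keys present on one side only."""
    both = pd.merge(
        data_df[[key]], geo_df[[key]], on=key, how="outer", indicator=True
    )
    only_in_data = sorted(both.loc[both["_merge"] == "left_only", key])
    only_in_geo = sorted(both.loc[both["_merge"] == "right_only", key])
    return only_in_data, only_in_geo


# Toy frames: "0" and "1000" mimic the national and state total rows,
# which have no county shape; "1007" mimics a shape with no data row.
data_df = pd.DataFrame({"FIPS_Code": ["0", "1000", "1001", "1003"]})
geo_df = pd.DataFrame({"FIPS_Code": ["1001", "1003", "1007"]})

only_in_data, only_in_geo = unmatched_keys(data_df, geo_df)
print(only_in_data)  # ['0', '1000']
print(only_in_geo)   # ['1007']
```

Keys that exist only in the data (such as national or state totals) are usually harmless to drop, whereas keys that exist only in the geoJSON are shapes that would have no value to display.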
Now we can quickly generate different choropleths:
    vis.tabular_data(merged, columns=['FIPS_Code', 'Civilian_labor_force_2011'])
    vis.to_json(path)