Information Visualization: How to discover and improve visual charts

Analysis and improvement of the chart “comparison of average total price of housing resources in Hangzhou city”

1. Background and why we chose the topic.

1.1 Background.

At present, house prices are very high all over China, especially in the first-tier cities, and young people may not be able to afford a house on their own. As a result, they need to start planning which city to live in, so that they can choose one that matches their means. In practice, however, they don't know where to start, because they lack house price information for the whole country. Housing price surveys from across the country are therefore a valuable reference for young people, who are in great need of this data.

1.2 Why we chose the topic.

Our team found several house price research reports on the Internet. One of them surveys house prices in Hangzhou, which has recently joined China's first-tier cities. Hangzhou is a popular city with a highly developed information industry; it is also known as “the city of the Internet” and attracts the attention of countless young people. A housing price survey of Hangzhou is therefore of great interest to many of them.

The research report includes a chart titled “the average total price comparison of housing resources in Hangzhou city”. It contains the information that young people are eager to get: the average house price in each part of Hangzhou. The chart conveys a lot of important and useful information, but this is mixed with a lot of useless information, which makes it harder for readers to grasp the key message. The chart has a lot of room for improvement. Because it matters so much to young people, let's analyze it and try to improve it.

The original chart:

[Figure: the original chart]
(The horizontal axis shows the urban district and the vertical axis shows the average total price of housing, in units of 10,000 yuan. The title of the chart is “comparison of average total prices of housing resources in different urban areas of Hangzhou”.)

2. Analyze the original chart.

2.1 Contents of the original chart.

The chart “comparison of average total price of housing resources in different urban areas of Hangzhou” is a bar chart. It contains the title; the coordinate axes; the axis labels “urban area” and “total price / 10000 yuan”; a vertical scale marked every 100 units; and one blue bar per urban district. The districts are arranged from left to right in descending order of average house price.

From this chart we can read the ranking of the districts by average house price, as well as the approximate average price in each district. For example, the average house price in West Lake District is the highest, at an estimated 4.8 million yuan per home. Moving from left to right, Binjiang District, Shangcheng District and Gongshu District come next, and so on. Fuyang District and Qiantang New District are among the lowest in Hangzhou, with an average price of about 2.5 million yuan per home.

2.2 Visual variables in the chart.

The visual variables in the chart are: the style, size, direction, order, spacing and fill color of the bars; the size, font, direction and spacing of the text; and the hue, brightness and chroma of the colors. In general, the visual variables are not complex. We can modify them and add new ones where appropriate.

2.3 Analyze the expression effect of the original chart.

The chart is relatively neat, with few variables and no complicated legend, so readers can get the information they want with little effort. For example, the sorted bars let the reader take in the ranking of the districts at a glance. The concise vertical scale, marked every 100 units, then gives the approximate price range of each district, and the reader links each district to its price. Because the amount of information is small and easy to extract, it can pass readily from short-term memory into long-term memory. Overall the chart communicates well: users easily get the key information it is meant to express and are likely to remember it.

This chart can meet young people's demand for housing price data, but that does not mean it is perfect. It has problems and there is room for improvement.

3. Try to reproduce the original chart

3.1 Get data.

We obtained 30,772 records from the research report. Here is part of the data:

(The data table has 14 columns: property rights (产权); attention (关注); district (区域); unit price (单价); community (小区); year built (年限); total price / 10,000 yuan (总价/万元); house type (户型); house code (房屋编码); listing date (挂牌时间); orientation (朝向); floor (楼层); decoration (装修情况); floor area (面积).)

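A sample of the original data table (column names and values are kept in Chinese because the code below refers to them; an empty cell means the value is missing in the sample):

产权 | 关注 | 区域 | 单价 | 小区 | 年限 | 总价/万元 | 户型 | 房屋编码 | 挂牌时间 | 朝向 | 楼层 | 装修情况 | 面积
70年 | 0 | 余杭临平 | 21015元/平米 | 众安理想湾 | 2015年建/板楼 | 210 | 3室2厅 | 103105013026 | 2019-06-12 | 南 北 | 低楼层/共33层 | 平层/精装 | 99.93平米
70年 | 4 | 余杭临平 | 28416元/平米 | 众安理想湾 | 2016年建/板塔结合 | 780 | 6室2厅 | 103104324906 | 2019-04-04 |  | 联排/共3层 | 毛坯 | 274.5平米
70年 | 2 | 余杭临平 | 17323元/平米 | 众安理想湾 | 2015年建/板楼 | 220 | 3室2厅 | 103102855120 | 2018-09-07 |  | 高楼层/共33层 | 精装 | 127平米
70年 | 0 | 余杭瓶窑 | 19613元/平米 | 北湖绿洲花园 | 2013年建/板楼 | 560 | 5室2厅 | 103104419897 | 2019-04-14 |  | 联排/共3层 | 毛坯 | 285.53平米
70年 | 0 | 余杭瓶窑 | 22314元/平米 | 北湖绿洲花园 | 2013年建/板楼 | 600 | 5室3厅 | 103104663598 | 2019-05-08 |  | 共3层 | 毛坯 | 268.9平米
70年 | 2 | 余杭瓶窑 | 14946元/平米 | 北湖绿洲花园 | 未知年建/板楼 | 275 | 4室2厅 | 103103212337 | 2018-10-25 | 南 北 | 高楼层/共11层 | 毛坯 | 184平米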
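Several columns in the sample store numbers together with their units (for example 21015元/平米 and 99.93平米). The sketch below is not part of the original report's code; it is a minimal illustration, using only the standard library, of how such fields could be turned into numbers and checked against the total price column. The helper name parse_number is our own, and the values come from the first sample row.

```python
import re


def parse_number(text: str) -> float:
    """Return the first number found in a string such as '21015元/平米' or '99.93平米'."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group())


# Values taken from the first sample row of the data table.
unit_price = parse_number("21015元/平米")  # yuan per square metre
floor_area = parse_number("99.93平米")     # square metres
total_price_wan = 210                      # listed total price, in units of 10,000 yuan

# Unit price times floor area, converted to units of 10,000 yuan,
# should be close to the listed total price.
implied_total_wan = unit_price * floor_area / 10_000
print(round(implied_total_wan, 1))
```

The implied total agrees with the listed total of 210 to within rounding, which suggests that the total price column is indeed expressed in units of 10,000 yuan.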

3.2 Write code.

Python code for drawing the original chart:

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

from pyecharts import options as opts
from pyecharts.charts import Bar

import warnings
warnings.filterwarnings('ignore')
#Categorizing data by region
def location(x):
    if "临安" in x: return "临安市"
    elif "上城" in x: return "上城区"
    elif "下城" in x: return "下城区"
    elif "江干" in x: return "江干区"
    elif "拱墅" in x: return "拱墅区"
    elif "西湖" in x: return "西湖区"
    elif "滨江" in x: return "滨江区"
    elif "萧山" in x: return "萧山区"
    elif "余杭" in x: return "余杭区"
    elif "富阳" in x: return "富阳区"
    elif "钱塘" in x: return "钱塘新区"
    else: return "其他"
#Read data
data=pd.read_csv("C:\\Users\\14110\\Desktop\\house.csv")

#Processing missing data
data.dropna(how="any",inplace=True)

#Call the region classification function
data["地理位置"]=data["区域"].apply(location)

#Calculate the average house price per region
sum_area=data.groupby("地理位置")["总价/万元"].mean().sort_values(ascending=False).reset_index()
#draw
plt.figure(figsize=(8, 6))
#Pass x and y as keyword arguments; recent seaborn versions no longer accept them positionally
ax=sns.barplot(x=sum_area["地理位置"],y=sum_area["总价/万元"],palette=sns.color_palette('Blues_r'))
ax.set_title("杭州市各城区房源平均总价对比")
ax.set_xlabel("城区") 
ax.set_ylabel("总价/万元") 
plt.show()

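#English equivalent of the plotting code above (labels translated for reference only;
#the DataFrame keeps its Chinese column names):
'''
plt.figure(figsize=(8, 6))
ax=sns.barplot(x=sum_area["geographical position"],y=sum_area["Total price/10000 yuan"],palette=sns.color_palette('Blues_r'))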
ax.set_title("Comparison of average total price of housing in Hangzhou")
ax.set_xlabel("region") 
ax.set_ylabel("Total price/10000 yuan")
plt.show()
'''
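For readers who do not have house.csv, the classification and averaging steps above can be tried on a tiny in-memory sample. This is only a sketch under stated assumptions: DISTRICT_KEYWORDS and classify are our own dictionary-based stand-ins for a subset of the location() function, the first two rows reuse values from the data sample, and the remaining rows (including the sub-district names 西湖文新 and 滨江长河) are made up for illustration.

```python
import pandas as pd

# Keyword -> district label; a subset of the if/elif chain in location().
DISTRICT_KEYWORDS = {"西湖": "西湖区", "滨江": "滨江区", "余杭": "余杭区"}


def classify(region: str) -> str:
    """Map a raw region string to its district, or to '其他' (other)."""
    for keyword, district in DISTRICT_KEYWORDS.items():
        if keyword in region:
            return district
    return "其他"


sample = pd.DataFrame({
    # Rows 1-2 come from the data sample; rows 3-5 are hypothetical.
    "区域": ["余杭临平", "余杭瓶窑", "西湖文新", "滨江长河", None],
    "总价/万元": [210.0, 560.0, 480.0, 400.0, 300.0],
})

# Same steps as the main script: drop incomplete rows, classify, average, sort.
sample = sample.dropna(how="any")
sample["地理位置"] = sample["区域"].apply(classify)
avg = (
    sample.groupby("地理位置")["总价/万元"]
    .mean()
    .sort_values(ascending=False)
    .reset_index()
)
print(avg)
```

The row with a missing region is dropped before classification, and the two 余杭 rows are averaged into a single bar, which is how the full script reduces the whole data set to one value per district.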

3.3 Result of the code.

[Figure: the chart reproduced by the code above]

(The horizontal axis shows the urban district and the vertical axis shows the average total price of housing, in units of 10,000 yuan. The title of the chart is “comparison of average total prices of housing resources in different urban areas of Hangzhou”.)

4. Aspects of the original chart that need to be improved.

  1. Color. The original chart fills the bars with a blue gradient. We do not think this is appropriate. A gradient makes users wonder “what does the gradient mean?”, so they can easily miss the key information in the chart. Moreover, the original chart runs through the gradient twice, which is unnecessary and distracts the user further. Users will think, “each color corresponds to two bars; what is the relationship between them?” In fact, the paired bars have no relationship other than being compared with each other. To improve this, we fill all bars with a single color. For the sake of user experience, we chose a softer light red for the bars and a light yellow background for the chart.

  2. Data labels. The chart has no grid lines, so it is difficult for users to read the value represented by each bar. Data labels should therefore be added to each bar. The labels show each bar's value directly and reduce the effort users spend estimating it, leaving them free to focus on the other key information in the chart.

  3. Remove the top and right axis borders of the chart. This breaks away from the traditional boxed-in look and makes the chart more attractive. A cleaner chart costs users less effort, and an attractive chart helps them relax and concentrate on the information in it.

  4. Add a hover tooltip. We add a tooltip to each bar. When the user places the mouse cursor on a bar, a floating box pops up next to it, showing the district the bar represents and the average house price there. This interactive style is relatively new and can arouse the user's interest in the chart. The static display of the chart gains no extra content, so the tooltip does not interfere with readers' access to the key information.
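Improvements 1 to 3 can also be tried on a static chart. The following is a minimal sketch with matplotlib (it assumes version 3.4 or later, which provides bar_label and list indexing of spines). It is not the implementation used in section 5, which relies on pyecharts. The function name draw_improved_bar, the district names and the prices are hypothetical placeholders, and the named colors "lightcoral" and "lightyellow" are our own reading of “light red” and “light yellow”. The hover tooltip of improvement 4 needs an interactive library and is not covered here.

```python
import matplotlib.pyplot as plt

# Hypothetical values for illustration only (not the report's data).
districts = ["District A", "District B", "District C"]
avg_total_price = [480.0, 390.5, 250.2]  # unit: 10,000 yuan


def draw_improved_bar(names, values):
    """Draw a bar chart with a single fill color, data labels and no top/right border."""
    fig, ax = plt.subplots(figsize=(8, 6))

    # Improvement 1: one soft fill color on a light background instead of a gradient.
    fig.patch.set_facecolor("lightyellow")
    ax.set_facecolor("lightyellow")
    bars = ax.bar(names, values, color="lightcoral")

    # Improvement 2: print each bar's value above it, with one decimal place.
    ax.bar_label(bars, fmt="%.1f")

    # Improvement 3: hide the top and right borders of the plotting area.
    ax.spines[["top", "right"]].set_visible(False)

    ax.set_xlabel("District")
    ax.set_ylabel("Total price / 10,000 yuan")
    ax.set_title("Comparison of average total price of housing")
    return fig, ax


fig, ax = draw_improved_bar(districts, avg_total_price)
```

Calling fig.savefig with a file name, or plt.show() in an interactive session, displays the result. With real data, the two lists would be replaced by the district and price columns of sum_area.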

5. Implement the improvements

5.1 Building on the code above, we wrote the following code.

from pyecharts import options as opts
from pyecharts.charts import Bar
#Keep one decimal place for the house price data,
#so that the data labels stay short.
sum_area['总价/万元'] = sum_area['总价/万元'].round(1)
#draw
bar2=Bar(init_opts=opts.InitOpts(theme='vintage',width = '650px', height='400px'))
bar2.add_xaxis(sum_area["地理位置"].to_list())
bar2.add_yaxis("总价/万元",sum_area["总价/万元"].to_list())
bar2.set_series_opts(label_opts=opts.LabelOpts(is_show=True))
bar2.set_global_opts(title_opts=opts.TitleOpts(title="杭州市各城区房源平均总价对比"),yaxis_opts=opts.AxisOpts(
            name='总价/万元'),xaxis_opts=opts.AxisOpts(name='地理位置',axislabel_opts={"interval":"0","rotate":45}))
bar2.render_notebook()

#English equivalent of the code above (labels translated for reference only):
'''
bar2=Bar(init_opts=opts.InitOpts(theme='vintage',width = '650px', height='400px'))
bar2.add_xaxis(sum_area["geographical position"].to_list())
bar2.add_yaxis("Total price/10000 yuan",sum_area["Total price/10000 yuan"].to_list())
bar2.set_series_opts(label_opts=opts.LabelOpts(is_show=True))
bar2.set_global_opts(title_opts=opts.TitleOpts(title="Comparison of average total price of housing in Hangzhou"),yaxis_opts=opts.AxisOpts(
            name='Total price/10000 yuan'),xaxis_opts=opts.AxisOpts(name='geographical position',axislabel_opts={"interval":"0","rotate":45}))
bar2.render_notebook()
'''

5.2 Result of the code.

[Figure: the improved chart]

(The horizontal axis shows the urban district and the vertical axis shows the average total price of housing, in units of 10,000 yuan. The title of the chart is “comparison of average total prices of housing resources in different urban areas of Hangzhou”.)

The chart has a special feature: when the mouse cursor points at one of the bars, a hover window pops up showing some information. Websites and apps can use this feature, but here the chart is shown only as a static picture, so the feature cannot be used.

The feature looks like this:

[Figure: hover window shown over one of the bars]

6. Conclusion.

Drawing on visualization theory and combining it with practice, we improved the expressiveness of the original chart and added some small features. Readers can now easily get more accurate information from the chart. We think that if the author of the original article adopted our improved chart, it would be more helpful for young people who are eager to learn the conclusions of housing price research.

 

Special thanks to the research report below, which provided the original chart, the data and part of the code:

https://www.kesci.com/mw/notebook/5eec4fe2e5f796002c2d69cb

 

Everyone is welcome to share their views in the comments section. Thank you!