1 tip for effective data visualization in Python

Yes, you read correctly – this post will only give you 1 tip. I know most posts like this have 5 or more tips. I once saw a post with 15 tips, but I may have been daydreaming at the time. You’re probably wondering what makes this 1 tip so special. “Vik”, you may ask, “I’ve been reading posts that have 7 tips all day. Why should I spend the time and effort to read a whole post for only 1 tip?”

I can only answer that data visualization is about quality, not quantity. Like me, you probably spent hours learning about all the various charts that are out there – pie charts, line charts, bar charts, horizontal bar charts, and millions of others. Like me, you thought you understood data visualization. But we were wrong. Because data visualization isn’t about making different types of fancy charts. It’s about understanding your audience and helping them achieve their goals.

Oh, this is embarrassing – I just gave away the tip. Well, if you keep reading, I promise that you’ll learn all about making effective data visualization, and why this one tip is useful. By the end, you’ll be able to make useful plots like this:

Let’s start by talking about some data visualizations in your daily life that may surprise you. Did you know that whenever you see a weather map on TV, check the time on your wall clock, or stop at a traffic light, you’re seeing a visual representation of numeric data? Don’t believe me? Let’s dive a little more into how a wall clock shows time. When the time is 5:05, you don’t see the actual time on the clock. Instead, you see a small “hand” pointing to 5, and a big “hand” pointing to 1, like this:

We’ve been trained to translate from this visual representation of the data to a time, 5:05.

Wall clocks are unfortunately an example of data visualization that makes it harder to understand the underlying data. It takes much more mental effort to parse the time on a wall clock than it does for a digital clock. Wall clocks were created before displaying the time on a digital display was possible, so the only solution was displaying the time via two “hands”.

Now let’s look at a visualization that makes it much easier to understand the underlying data: the weather map. Take this map as an example:

Looking at the map above, you can instantly tell that the coasts of Andhra Pradesh and Tamil Nadu are some of the hottest places in India. Arunachal Pradesh and Jammu and Kashmir are some of the coldest. We can see the “lines” along which higher average temperatures transition to lower average temperatures. The map is great for looking at geographic temperature trends despite display issues with the map – some labels overflow their boxes, or are too light.

If we had instead represented this as a table, we would have “lost” a significant amount of data. For example, from the map, we can quickly tell that Hyderabad is colder than the coast of Andhra Pradesh. In order to communicate all of the information in the map, we’d need a table full of temperature data for every place in India, like this but longer:

   City       Average Annual Temperature (°C)
0  Hyderabad  27.0
1  Chennai    29.5
2  Raipur     26.0
3  New Delhi  23.0

This table is hard to think of in geographic terms. Two cities next to each other in the table might be right next to each other geographically, or extremely far apart. It’s hard to figure out geographic trends when you’re looking at one city at a time, so the table isn’t useful for looking at high-level geographic temperature changes.

However, the table is extremely useful for looking up the average temperature of your city – far more useful than the map. You can instantly tell that the average annual temperature of Hyderabad is 27.0 degrees Celsius.

Understanding what representations of the data are useful in which contexts is critical for creating effective data visualizations.

What we’ve learned so far

  • Visualizations aren’t always better than numbers for representing data.
  • Even a visualization that doesn’t look good can be effective if it matches the goals of the audience.
  • Effective visualization can enable viewers to discover patterns that they could never find using numeric representations.

In this post, we’ll learn how to make effective visualizations by walking through visualizing the performance of our investment portfolio. We’ll represent the data a few different ways, and talk about the pros and cons of each approach.

Too many tutorials start with making charts, but never discuss why those charts are being made. At the end of this post, you’ll have more insight into what charts are useful in which situations, and be able to more effectively communicate using data. If you want to go more in depth, you should try our courses on exploratory data visualization and storytelling through data visualization.

We’ll be using Python 3.5 and Jupyter notebook in case you want to follow along.

Tabular representations of the data

Let’s say that we own a few shares of stock, and we want to track their performance:

  • AAPL – 500 shares
  • GOOG – 450 shares
  • BA – 250 shares
  • CMG – 200 shares
  • NVDA – 100 shares
  • RHT – 500 shares

We bought all of the shares on November 7th, 2016, and we want to track their performance to date. We first need to download the daily share price data, which we can do with the yahoo-finance package. We can install the package using pip install yahoo-finance.

In the below code, we:

  • Import the yahoo-finance package.
  • Set a list of symbols to download.
  • Loop through each symbol:
    • Download data from 2016-11-07 to the previous day.
    • Extract the closing prices for each day.
  • Create a dataframe with all of the price data.
  • Display the dataframe.

from yahoo_finance import Share
import pandas as pd
from datetime import date, timedelta

symbols = ["AAPL", "GOOG", "BA", "CMG", "NVDA", "RHT"]

data = {}
days = []
for symbol in symbols:
    share = Share(symbol)
    yesterday = (date.today() - timedelta(days=1)).strftime("%Y-%m-%d")
    prices = share.get_historical('2016-11-7', yesterday)
    close = [float(p["Close"]) for p in prices]
    days = [p["Date"] for p in prices]
    data[symbol] = close

stocks = pd.DataFrame(data, index=days)
stocks.head()

                  AAPL          BA         CMG        GOOG        NVDA        RHT
2017-02-08  132.039993  163.809998  402.940002  808.380005  118.610001  78.660004
2017-02-07  131.529999  166.500000  398.640015  806.969971  119.129997  78.860001
2017-02-06  130.289993  163.979996  395.589996  801.340027  117.309998  78.220001
2017-02-03  129.080002  162.399994  404.079987  801.489990  114.379997  78.129997
2017-02-02  128.529999  162.259995  423.299988  798.530029  115.389999  77.709999

As you can see above, this gives us a table where each column is a stock symbol, each row is a date, and each cell is the price of that stock symbol on that date. The entire dataframe has 62 rows. This is very good if we want to look up the price of a specific stock on a specific day. For example, I can quickly tell that AAPL shares cost 128.75 at market close on February 1st, 2017.

However, we might only care whether we’ve made or lost money on each stock symbol. We can find the difference between the price of each share when we bought it and the current price.

In the below code, we subtract the stock prices when we bought them from the current stock prices.
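
A minimal sketch of that subtraction follows. It assumes `stocks` is the dataframe built earlier, indexed by date strings, and it treats 2017-02-06 as the current date because the percentage calculation later in this post uses that date. To stay runnable without a download, it uses a two-row stand-in: the 2017-02-06 prices are copied from the table above, while the 2016-11-07 prices are back-computed from the differences printed below, so treat them as illustrative.

```python
import pandas as pd

# Two-row stand-in for the downloaded `stocks` dataframe.
# 2017-02-06 values come from the table above; 2016-11-07 values are
# back-computed from the printed differences (an assumption, not a download).
stocks = pd.DataFrame(
    {
        "AAPL": [130.289993, 110.410004],
        "BA": [163.979996, 143.029999],
        "CMG": [395.589996, 382.489990],
        "GOOG": [801.340027, 782.520020],
        "NVDA": [117.309998, 71.269997],
        "RHT": [78.220001, 76.669998],
    },
    index=["2017-02-06", "2016-11-07"],
)

# Current price minus purchase-day price, one value per stock symbol.
change = stocks.loc["2017-02-06"] - stocks.loc["2016-11-07"]
print(change)
```

Selecting a row with `.loc` returns a Series keyed by symbol, so the subtraction lines up each stock with itself.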


AAPL    19.879989
BA      20.949997
CMG     13.100006
GOOG    18.820007
NVDA    46.040001
RHT      1.550003
dtype: float64

Great! It looks like we made money on every investment. However, we can’t tell by what percentage our investments have increased. We can do this with a slightly more complex formula:

pct_change = (stocks.loc["2017-02-06"] - stocks.loc["2016-11-07"]) / stocks.loc["2016-11-07"]
pct_change

AAPL    0.180056
BA      0.146473
CMG     0.034249
GOOG    0.024051
NVDA    0.645994
RHT     0.020217
dtype: float64

It looks like our investments have done extremely well percentage-wise. But it’s hard to tell how much money we’ve made overall. Let’s multiply the price change by our share counts to see how much we’ve made:
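
One way to write that multiplication is sketched here, using the per-share changes copied from the output above. The names `share_counts` and `portfolio_change` both appear elsewhere in this post; making `share_counts` a pandas Series keyed by symbol is an assumption.

```python
import pandas as pd

# Share counts from the holdings list earlier in the post, keyed by symbol.
# Storing them as a Series is an assumption made for this sketch.
share_counts = pd.Series(
    {"AAPL": 500, "GOOG": 450, "BA": 250, "CMG": 200, "NVDA": 100, "RHT": 500}
)

# Per-share price changes, copied from the output shown earlier.
change = pd.Series(
    {
        "AAPL": 19.879989,
        "BA": 20.949997,
        "CMG": 13.100006,
        "GOOG": 18.820007,
        "NVDA": 46.040001,
        "RHT": 1.550003,
    }
)

# pandas aligns the two Series on their symbol labels before multiplying,
# so the order in which the symbols were listed does not matter.
portfolio_change = change * share_counts
print(portfolio_change)
```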


AAPL    9939.99450
BA      5237.49925
CMG     2620.00120
GOOG    8469.00315
NVDA    4604.00010
RHT      775.00150
dtype: float64

Finally, we can add up how much we’ve made in total:

sum(portfolio_change)

31645.49969999996

And look at our purchase price to eyeball how much we’ve made on a percentage basis:
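
A sketch of the purchase-price total, under the same assumptions as before: `share_counts` is a Series keyed by symbol, and `purchase_prices` is a hypothetical name standing in for `stocks.loc["2016-11-07"]`, filled with values back-computed from the price differences shown earlier.

```python
import pandas as pd

# Share counts from the holdings list, keyed by symbol (assumed to be a Series).
share_counts = pd.Series(
    {"AAPL": 500, "GOOG": 450, "BA": 250, "CMG": 200, "NVDA": 100, "RHT": 500}
)

# Stand-in for stocks.loc["2016-11-07"]; these prices are back-computed from
# the differences printed earlier rather than downloaded.
purchase_prices = pd.Series(
    {
        "AAPL": 110.410004,
        "BA": 143.029999,
        "CMG": 382.489990,
        "GOOG": 782.520020,
        "NVDA": 71.269997,
        "RHT": 76.669998,
    }
)

# Total amount paid: purchase price times share count, summed over all symbols.
purchase_total = sum(purchase_prices * share_counts)
print(purchase_total)

# Overall return as a fraction, using the total gain computed above.
total_gain = 31645.49969999996
print(total_gain / purchase_total)
```

Dividing the total gain by the purchase total gives the overall return directly, which saves eyeballing it from two large numbers.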


565056.50745000003

We’ve gotten pretty far with numeric data representations. We were able to figure out how much our portfolio value increased. In many cases, data visualization isn’t necessary, and a few numbers can express everything you want to share. In this section, we learned that:

  • Numeric representations of data can be enough to tell a story.
  • It’s good to try to simplify tabular data when you can before moving to visualization.
  • Understanding the goals of your audience is important to effectively representing data.

Numeric representations stop working well when you want to find patterns or trends in your data. Let’s say we wanted to figure out if any stocks were more volatile in December, or if any stocks went down then back up. We could try to use measures like standard deviation, but they wouldn’t give us the whole story:

stocks.std()

AAPL     6.135476
BA       6.228163
CMG     15.352962
GOOG    21.431396
NVDA    11.686528
RHT      3.225995
dtype: float64

The above tells us that roughly 68% of AAPL closing share prices in our time period are within 6.14 of the mean price, assuming the prices are approximately normally distributed. It’s hard to tell if this indicates low or high volatility, though. It’s also hard to tell if AAPL has recently increased in price or not.

In the next section, we’ll figure out how to visualize our data to identify these hard to quantify trends.

Plotting all our stock symbols

The first thing we can do is make a plot of each stock series. We can do this by using the pandas.DataFrame.plot method. This will create a line plot of the daily closing prices for each stock symbol. We need to first sort the dataframe in reverse order, as currently, it is sorted in descending order of date, and we want it to be in ascending order:
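
The sort-then-plot step might look like the sketch below. It uses three rows and three symbols copied from the table earlier in this post as a stand-in for the full `stocks` dataframe. Reversing with `iloc[::-1]` is one reasonable choice, not necessarily the original code; `stocks.sort_index()` would give the same order for these date strings.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook; omit in Jupyter
import pandas as pd

# Three rows and three symbols from the table above, newest date first,
# standing in for the full downloaded dataframe.
stocks = pd.DataFrame(
    {
        "AAPL": [132.039993, 131.529999, 130.289993],
        "GOOG": [808.380005, 806.969971, 801.340027],
        "NVDA": [118.610001, 119.129997, 117.309998],
    },
    index=["2017-02-08", "2017-02-07", "2017-02-06"],
)

# Reverse the row order so dates run oldest to newest along the x axis.
stocks = stocks.iloc[::-1]

# Draw one line per stock symbol, with the date index on the x axis.
ax = stocks.plot()
```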


<matplotlib.axes._subplots.AxesSubplot at 0x1105cd048>

The above plot is a good start, and we’ve come very far in a short amount of time. Unfortunately, the chart is a bit cluttered, and it’s hard to tell the overall trends for some of the lower priced symbols. Let’s normalize the chart to show each daily closing price as a fraction of the starting price:

normalized_stocks = stocks / stocks.loc["2016-11-07"]
normalized_stocks.plot()

<matplotlib.axes._subplots.AxesSubplot at 0x10f81b8d0>

This plot is much better for seeing relative trends in each stock price. Each line shows us how the value of the stock is changing relative to its purchase price. This shows us which of our stocks are increasing on a percentage basis, and which ones aren’t. We can see that the price of NVDA shares increased very steeply soon after we bought them, and has continued to increase. RHT seems to have lost quite a bit of value at the end of December, but the price has been recovering steadily.

Unfortunately, there are some visual issues with this plot that make the plot hard to read. The labels are squished together, and it’s hard to see what happens to GOOG, CMG, RHT, BA, and AAPL since the lines are bunched together. We’ll increase the size of the plot using the figsize keyword argument, and increase the width of the lines to fix these issues. We’ll also increase the axis label and axis font sizes to make them easier to read.
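
A sketch of those adjustments follows. The `figsize=(15, 8)` and `fontsize=14` values match the ones used in the area plot later in this post; `linewidth=3` is an assumed value, and the two-row dataframe (with 2016-11-07 prices back-computed from the differences shown earlier) is only a stand-in for the real data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Two-row stand-in for `stocks`, oldest date first. The 2016-11-07 prices
# are back-computed from the differences shown earlier, not downloaded.
stocks = pd.DataFrame(
    {
        "AAPL": [110.410004, 130.289993],
        "GOOG": [782.520020, 801.340027],
        "NVDA": [71.269997, 117.309998],
    },
    index=["2016-11-07", "2017-02-06"],
)

# Express every price as a fraction of the purchase-day price.
normalized_stocks = stocks / stocks.loc["2016-11-07"]

# Larger figure, thicker lines, and bigger tick labels.
# linewidth=3 is an assumed value chosen for illustration.
ax = normalized_stocks.plot(figsize=(15, 8), linewidth=3, fontsize=14)

# Redraw the legend with a larger font to match the tick labels.
plt.legend(fontsize=14)
```

Extra keyword arguments such as `linewidth` are passed through by pandas to matplotlib, so any line property matplotlib accepts can be set the same way.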


<matplotlib.legend.Legend at 0x10eaba160>

In the above plot, it’s far easier to separate the lines visually since we have more space, and they’re thicker. The labels are also easier to read since they’re larger.

Let’s say that we want to see how much of our total portfolio value is in each stock over time, on a percentage basis. We’d need to first multiply each stock price series by the number of shares we hold, then divide by the total portfolio value, then make an area plot. This would let us see if some stocks are increasing enough to constitute a much larger share of our overall portfolio.

In the below code, we:

  • Multiply each stock price by the number of shares we hold to get the total worth of the shares we own of each symbol.
  • Divide each row by the total value of the portfolio on that date to figure out what percentage of our portfolio value each stock is.
  • Plot the values in an area plot, where the y axis goes from 0 to 1.
  • Hide the y axis labels.

import matplotlib.pyplot as plt

portfolio = stocks * share_counts
portfolio_percentages = portfolio.apply(lambda x: x/sum(x), axis=1)
portfolio_percentages.plot(kind="area", ylim=(0,1), figsize=(15,8), fontsize=14)
plt.yticks([])
plt.legend(fontsize=14)

<matplotlib.legend.Legend at 0x10ea8cda0>

As you can see above, most of our portfolio’s value is in GOOG stock. The overall allocation of dollars per stock symbol hasn’t changed much since we purchased them. From looking at the data in a different way earlier, we know that the price of NVDA has grown quite quickly in the past few months, but from this view, we can see that its total value isn’t that much of our portfolio. This means that although the stock price of NVDA has grown substantially, it hasn’t had a huge effect on our overall portfolio value.

Note how the chart above is fairly hard to parse and see trends in. This is an example of a chart that’s usually better as a series of numbers that show the average percentage of portfolio value each stock comprises. A good way to think about this is “what questions can we answer better with this chart than with any other chart?” If the answer is “no questions”, then you’re probably better off with something else.

To get a better handle on our overall portfolio value over time, we can plot it out:
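
A minimal sketch of that plot: multiply each price by the number of shares held, sum across symbols for each date, and plot the resulting series. As before, the two-row dataframe is a stand-in (2017-02-06 prices from the table earlier, 2016-11-07 prices back-computed), `share_counts` is assumed to be a Series keyed by symbol, and `portfolio_value` is a hypothetical name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook; omit in Jupyter
import pandas as pd

# Two-row stand-in for `stocks`, oldest date first. The 2016-11-07 prices
# are back-computed from the differences shown earlier, not downloaded.
stocks = pd.DataFrame(
    {
        "AAPL": [110.410004, 130.289993],
        "BA": [143.029999, 163.979996],
        "CMG": [382.489990, 395.589996],
        "GOOG": [782.520020, 801.340027],
        "NVDA": [71.269997, 117.309998],
        "RHT": [76.669998, 78.220001],
    },
    index=["2016-11-07", "2017-02-06"],
)

# Share counts keyed by symbol (assumed to be a Series).
share_counts = pd.Series(
    {"AAPL": 500, "GOOG": 450, "BA": 250, "CMG": 200, "NVDA": 100, "RHT": 500}
)

# Dollar value of each holding per date, then the row-wise total.
portfolio_value = (stocks * share_counts).sum(axis=1)

# A single line showing total portfolio value over time.
ax = portfolio_value.plot(figsize=(15, 8), linewidth=3, fontsize=14)
```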


<matplotlib.axes._subplots.AxesSubplot at 0x110dede48>

When looking at the above plot, we can see that our portfolio lost a good amount of money around the beginning of November, the end of December, and the end of January. When we look at some of our previous plots, we can discover that this is mostly due to drops in the price of GOOG, which makes up most of our portfolio value.

Being able to visualize the data from different angles helps us untangle the story of our overall portfolio, and answer questions more intelligently. For instance, making these plots helped us figure out:

  • The overall portfolio value trends.
  • Which stocks make up what percentage of our portfolio value.
  • The movement of the individual stocks.

Without understanding all three points, we wouldn’t be able to figure out why the value of our portfolio is changing. To tie this all the way back to the tip we started this post with, understanding your audience and the questions they’ll ask will help you design visualizations that meet their goals. The key to effective visualization is ensuring that it helps your audience understand complex tabular data more easily.

Next steps

In this post, you’ve learned:

  • How to simplify numeric data.
  • What questions can be answered with numeric data, and what questions require visualizations.
  • How to frame visualizations to answer questions.
  • Why we make visualizations in the first place.
  • Why some charts aren’t as useful to our audience.

Translated from: https://www.pybloggers.com/2017/02/1-tip-for-effective-data-visualization-in-python/
