The Best Python Visualization Tools

Disclaimer: I work for Datapane

Motivation

Amazing articles on data visualization appear on Medium every day. The volume can lead to information overload, but that shouldn’t stop you from exploring them, since they can teach you many new techniques for creating effective visualizations for your projects.

I have gone through the recent Medium posts on Python visualization and put together the best ones — with the hope that it will make it easier for you to explore them yourself. I’ve submitted these to the Datapane gallery, which is hosting them for us.

If you don’t know Datapane already, it is an open-source framework for people who analyze data in Python and need a way to share their results. Datapane hosts a free public platform with a gallery and community of people who share and collaborate on Python data visualization techniques.

In this article, I will use plots from the gallery as examples to show what makes a plot effective, introduce different kinds of plots, and explain how to create them yourself!

Principal Component Analysis and SVM in a Pipeline with Python

Interesting Idea

In this article, Saptashwa Bhattacharyya combines SVM, PCA, and grid-search cross-validation into a pipeline that finds the best parameters for binary classification. He then plots a decision boundary to show how well the algorithm has performed.

Impressive Visualization

Joint-plots

A joint plot is really helpful for showing both the distribution of two variables and the relationship between them in one plot. The darker the hexagon, the more points (observations) fall in that region.

Contour plot and Kernel density estimation: KDE (Kernel density estimation) is a useful statistical tool that lets you create a smooth curve given a set of data. This can be useful if you want to visualize just the “shape” of some data (instead of the discrete histogram).
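
To make the idea of a "smooth curve" concrete, here is a minimal sketch of a one-dimensional Gaussian KDE in plain Python. It is not the code from the article: the sample values, the `gaussian_kde` helper, and the bandwidth of 0.5 are all assumptions chosen for illustration. The estimate places one Gaussian bump on each observation and averages them.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centred on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Toy data, made up for illustration only.
samples = [1.0, 1.2, 1.9, 2.1, 2.3, 4.8, 5.0]
kde = gaussian_kde(samples, bandwidth=0.5)  # bandwidth is an assumed value

# Evaluate the curve on a grid; these (x, y) pairs are what a plot would draw.
grid = [i * 0.1 for i in range(71)]
curve = [(x, kde(x)) for x in grid]
```

A smaller bandwidth gives a bumpier curve that follows individual points, while a larger one gives a smoother curve that hides detail. Plotting libraries usually pick a bandwidth for you.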

Pair plots: Pair plots make it much easier to compare the correlation among different pairs of variables. In the plot below, you can see that the mean area is strongly correlated with the mean radius. The difference in color also makes the behavior of each label in each pair really clear.

Contour plot of SVM: This contour plot helps you see how likely a point lying in a given region is to actually belong to that region's class. I especially like the contour plot below because I can see in which regions the malignant cells, benign cells, and support vectors lie, which helps me understand how SVM works.

3D SVM plot: We often see 2D SVM plots, but most real data is multi-dimensional. This 3D plot is really helpful for understanding how SVM works in multi-dimensional space.

Resources to Explore

Visualize Gapmind and Basketball Dataset with Plotly

Interesting Idea

A good plot is not only beautiful but also conveys the right message to viewers. Without the right proportions in the graph, viewers may interpret the message differently.

That is why having the right proportions is so important. In this article, JP Hwang shows you how to create effective plots with the right proportions.

Impressive Visualization

Plots that show change over time

This plot effectively shows each continent's share of the world's population at a single point in time:

But how do you create the plots to show the change of the proportions of population of different continents over time?

The author shows that you can do exactly that with the plots below:

As you can see from the plots above, a time dimension has been added. Now the plots show not only the distribution but also how the overall number and the distribution change over time! Neat!
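
Before you can plot proportions over time, you need each group's share of the total for every time step. The sketch below shows that step in plain Python. The counts are toy values I made up (not real population figures), and `shares_by_year` is a hypothetical helper name, not something from the article.

```python
# Toy counts per continent and year; made-up values, not real population data.
population = {
    1960: {"Asia": 50, "Africa": 20, "Europe": 30},
    2000: {"Asia": 120, "Africa": 50, "Europe": 30},
}

def shares_by_year(data):
    """Convert absolute counts into each group's share of that year's total."""
    result = {}
    for year, counts in data.items():
        total = sum(counts.values())
        result[year] = {name: value / total for name, value in counts.items()}
    return result

shares = shares_by_year(population)

# Print one line per year, with each share formatted as a percentage.
for year in sorted(shares):
    row = ", ".join(f"{name}: {share:.1%}" for name, share in shares[year].items())
    print(year, row)
```

Keeping both the absolute totals and the shares lets a chart show the overall growth and the changing mix at the same time.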

Bubble graph

As the number of data points grows across both dimensions, bar charts and stacked bar charts become harder to read because the individual bars are too small to convey any meaningful information.

That is why the author's use of a bubble chart is clever: it fits many more data points into one chart while still clearly showing the change in proportion over time.

Resources to Explore

Altair plot deconstruction: visualizing the correlation structure of weather data

Interesting Idea

You might know how a heatmap is used to show the correlation between different variables in the data, but what if you want to get more insight out of the number .71 in the heatmap? In this article, Paul Hiemstra shows how you can do that by combining a heatmap with a 2D histogram to explore the structure of a weather dataset.

Impressive Visualization

Linked plots with Altair

Heatmaps and 2D histograms are both effective at showing correlation, but combining them is even more effective. The linked plots below do just that.

When you click a square in the heatmap, a 2D histogram for that square appears on the right-hand side!

What does a correlation of .98 look like in a 2D histogram? We expect a linear relationship, and the plot on the right confirms it. In contrast, the 2D histogram for a correlation of .12 shows no apparent pattern. Very intuitive and easy to understand.
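
To see how a single correlation number relates to a 2D histogram, here is a small plain-Python sketch that computes both. It uses toy data I made up rather than the weather dataset from the article. `pearson` and `histogram_2d` are hypothetical helper names, and the choice of 4 bins is arbitrary. The code assumes each variable has at least two distinct values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def histogram_2d(xs, ys, bins):
    """Count points per cell of a bins-by-bins grid; counts[row][col], row = y bin."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        # Clamp so the maximum value lands in the last bin instead of overflowing.
        col = min(int((x - x_min) / (x_max - x_min) * bins), bins - 1)
        row = min(int((y - y_min) / (y_max - y_min) * bins), bins - 1)
        counts[row][col] += 1
    return counts

# Toy data, made up for illustration.
x = list(range(20))
strong = [0.9 * v + (1 if v % 2 else -1) for v in x]  # nearly linear in x
weak = [(9 * v) % 20 for v in x]                      # scrambled, little linear trend

r_strong = pearson(x, strong)
r_weak = pearson(x, weak)
hist_strong = histogram_2d(x, strong, bins=4)
hist_weak = histogram_2d(x, weak, bins=4)

print(f"strong r = {r_strong:.2f}, weak r = {r_weak:.2f}")
for row in reversed(hist_strong):  # print with the largest y bin on top
    print(row)
```

For the strongly correlated pair the counts gather along the diagonal of the grid, while for the weakly correlated pair they spread across it. That is the contrast the linked plots make visible.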

Resources to Explore

Sankey Diagram Basics with Python’s Plotly

Interesting Idea

How do you visualize a network with different sources of inflow and outflow? For example, which services are the government’s revenues, such as taxes and utilities, spent on? And how does the share of spending on one service compare with the spending on other services?

In this article, Thiago Carvalho shows how to visualize such networks effectively with Sankey diagrams.

Impressive Visualization

Sankey Diagrams

If you click on each network of the diagram, you can clearly see which services the revenues are spent on. If you look solely at the nodes on the left-hand side, you can compare the proportions of the different revenues, and you can do the same with the nodes on the right-hand side. This technique is particularly useful for mapping processes, such as a sales pipeline or the paths visitors take on your website. It is amazing how one diagram can convey so much information.
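
A Sankey diagram is described by a list of node labels plus a list of links, where each link names a source node, a target node, and a value. Here is a minimal plain-Python sketch of building that structure from a list of flows. The flows and amounts are made up for illustration, and `build_sankey` is a hypothetical helper, not code from the article.

```python
# Hypothetical revenue-to-service flows; names and amounts are made up.
flows = [
    ("Taxes", "Education", 40),
    ("Taxes", "Health", 35),
    ("Utilities", "Health", 10),
    ("Utilities", "Transport", 15),
]

def build_sankey(flows):
    """Turn (source, target, value) triples into node labels and index-based links."""
    labels = []
    index = {}
    for source, target, _ in flows:
        for name in (source, target):
            if name not in index:
                # Give each distinct node the next free integer index.
                index[name] = len(labels)
                labels.append(name)
    link = {
        "source": [index[s] for s, _, _ in flows],
        "target": [index[t] for _, t, _ in flows],
        "value": [v for _, _, v in flows],
    }
    return {"label": labels}, link

node, link = build_sankey(flows)
print(node)
print(link)
```

If you have Plotly installed, dictionaries of this shape are what `plotly.graph_objects.Sankey(node=node, link=link)` takes. The width of each link is drawn in proportion to its value.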

Resources to Explore

COVID-19’s Impact on U.S. Unemployment Rate of Different Social Groups

Interesting Idea

You might know how badly COVID-19 has affected the economy, but do you know how it affects different social groups? This article by Shinichi Okada aims to answer that question with an animated bar chart.

Impressive Visualization

Animated Bar Chart

The most common way to see how a bar chart changes over time is to use a slider that you drag to step through the changes yourself. But because you rarely drag the slider at a constant speed, you will not see the change unfold at a steady rate. That is why the animated bar chart is so effective.

Click the play button on the left-hand side to see how the bar chart changes over time! Now you can clearly see how COVID-19 affects different social groups differently in different time periods.
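
An animation like this boils down to one set of bars per time step, shown for the same length of time each. The plain-Python sketch below groups records into ordered frames. The records are made-up values, not real unemployment data. `build_frames` and the 500 ms frame duration are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical records: (month, group, unemployment rate in percent); made-up values.
records = [
    ("2020-03", "Group A", 4.0),
    ("2020-02", "Group A", 3.5),
    ("2020-02", "Group B", 5.0),
    ("2020-03", "Group B", 6.5),
    ("2020-04", "Group A", 13.0),
    ("2020-04", "Group B", 16.0),
]

def build_frames(records):
    """Group records by month and return the frames in chronological order."""
    by_month = defaultdict(dict)
    for month, group, rate in records:
        by_month[month][group] = rate
    # "YYYY-MM" strings sort chronologically when sorted as plain text.
    return [(month, by_month[month]) for month in sorted(by_month)]

frames = build_frames(records)
FRAME_MS = 500  # assumed constant display time per frame

for i, (month, bars) in enumerate(frames):
    print(f"t={i * FRAME_MS}ms {month}: {bars}")
```

Because every frame gets the same duration, the viewer sees time pass at a steady rate, which is what a hand-dragged slider cannot guarantee.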

Resources to Explore

Conclusion

I hope this article gives you a good starting point for exploring interesting Medium articles on visualization. The best way to learn anything is to try it yourself. Pick one visualization, run the code, and observe the magic!

I like to write about basic data science concepts and play with different algorithms and data science tools. You could connect with me on LinkedIn and Twitter.

Star this repo if you want to check out the code for all of the articles I have written. Follow me on Medium to stay informed about my latest data science articles.

Originally published at https://datapane.com on March 16, 2020.

Translated from: https://towardsdatascience.com/best-python-visualizations-on-medium-a04921f61559
