Star Charts in Python

This post shows how to draw a star chart in Python, based on an article from Towards Data Science, and includes the related code examples.

Diamonds are a data scientist’s best friend. More specifically, the diamonds dataset found on Kaggle. In this article, I will walk through a simple workflow to create a Star Chart (also known as a Spider Chart or Radar Chart). This tutorial was adapted from the wonderful workflow of Alex at Python Charts. All of the code in this article and the needed dataset are available on GitHub.

Diamonds are a Data Scientist’s Best Friend

To begin you will need a few libraries. I am running Python 3. I created this workflow using a Jupyter notebook, pandas, matplotlib, and numpy. These packages can be installed via pip or conda if they are not already on your system.

pip install jupyterlab
pip install pandas
pip install matplotlib
pip install numpy

The dataset can be downloaded from Kaggle and should be around 3.2 MB. A copy is also included in the GitHub repository. I keep the dataset in the data folder. Load the dataset with pandas, drop the extra index column, and we are off!

import pandas as pd

df = pd.read_csv("data/diamonds.csv")
df.drop("Unnamed: 0", axis=1, inplace=True)

Levels of the 3 C’s

The 4 C’s of diamonds are Cut, Color, Clarity, and Carat. Cut, Color, and Clarity are categorical variables used in the diamond industry. Carat is a numeric variable representing the weight of a stone.

Photo by Hao Zhang on Unsplash

To create the Star Chart, we need to represent the diamond industry terms as numbers. To do this, we need to gather information about the levels that are in our dataset. Cut is composed of five levels, with Ideal being the highest [4] and Fair being the lowest [0]. Of the seven levels of Color, D is the highest [6] and J is the lowest [0]. Finally, Clarity is composed of eight levels, with IF (internally flawless) as the highest [7] and I1 (inclusions level 1) as the lowest [0].

Cutting and Polishing Data for Display

In our dataset, we cut 3 outliers in carat size that skew the downstream column scaling.

## Cut diamonds that skew carat range
indices_to_remove = [27415, 27630, 27130]
df = df.drop(indices_to_remove)
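
The article does not say how these three index labels were chosen. One plausible way to locate such rows, shown here as a sketch on made-up data rather than as the author's method, is to look at the largest carat values with `Series.nlargest`. The `toy` dataframe and its values are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the diamonds dataframe; the carat values are made up.
toy = pd.DataFrame({"carat": [0.3, 0.7, 5.0, 1.1, 4.5, 0.9, 4.1]})

# Index labels of the three largest carat values (candidate outliers).
largest = toy["carat"].nlargest(3).index.tolist()
print(largest)  # [2, 4, 6]

# Drop them by label, the same way the article drops its three rows.
trimmed = toy.drop(largest)
print(trimmed["carat"].max())  # 1.1
```

Whether a large stone counts as an outlier is a judgment call; inspecting the candidates before dropping them is usually worthwhile.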

Next, we create new columns in our dataframe to hold the rankings, which are created by mapping a dictionary onto each C’s column. An example of the mapping is below.

cut = {'Ideal': 4, 'Premium': 3, 'Very Good': 2, 'Good': 1, 'Fair': 0}
df['Cut'] = df['cut'].map(cut)  # Note: 'Cut' (capitalized) is a new column holding the ranking
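
Only the Cut mapping is shown above. The Color and Clarity mappings presumably follow the same pattern; the sketch below assumes the dataset uses the standard grade labels (D through J for color, I1 through IF for clarity) and applies them to a tiny made-up sample with the same lowercase column names.

```python
import pandas as pd

# Rankings follow the levels described earlier: a higher number is a better grade.
color = {'D': 6, 'E': 5, 'F': 4, 'G': 3, 'H': 2, 'I': 1, 'J': 0}
clarity = {'IF': 7, 'VVS1': 6, 'VVS2': 5, 'VS1': 4,
           'VS2': 3, 'SI1': 2, 'SI2': 1, 'I1': 0}

# Made-up sample rows; the real dataframe is df from the article.
toy = pd.DataFrame({'color': ['E', 'J', 'D'], 'clarity': ['SI2', 'IF', 'VS1']})
toy['Color'] = toy['color'].map(color)
toy['Clarity'] = toy['clarity'].map(clarity)
print(toy)
```

Any label missing from a dictionary maps to NaN, so checking `toy['Color'].isna().sum()` after mapping is a quick way to catch a typo in the keys.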

Finally, we need to scale the columns that we will use in our Star Chart to represent the data fairly.

## Convert all rankings and continuous data to scale between 0-100
factors = ['Cut', 'Color', 'Clarity', 'carat', 'price']
new_max = 100
new_min = 0
new_range = new_max - new_min

## Create Scaled Columns
for factor in factors:
    max_val = df[factor].max()
    min_val = df[factor].min()
    val_range = max_val - min_val
    # Min-max scaling: map [min_val, max_val] onto [new_min, new_max]
    df[factor + '_Adj'] = ((df[factor] - min_val) * new_range) / val_range + new_min
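
As a quick check of the scaling formula, here is the same min-max calculation on three made-up prices. The smallest value should land on 0 and the largest on 100; the `prices` series is an assumption for illustration.

```python
import pandas as pd

prices = pd.Series([326, 1000, 18823])  # made-up sample of prices

new_min, new_max = 0, 100
val_range = prices.max() - prices.min()

# Same formula as the loop above, applied to a single column.
scaled = (prices - prices.min()) * (new_max - new_min) / val_range + new_min
print(scaled.round(2).tolist())
```

Because the scaling is linear, the ordering of the values is preserved; only their range changes.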

We then subset the scaled columns for downstream plotting. Notice how we are creating a new dataframe (df2) with only the columns we intend to use in the Star Chart.

## Subset scaled columns 
df2 = df[['Cut_Adj', "Color_Adj", "Clarity_Adj", "carat_Adj", "price_Adj"]]
df2.columns = ['Cut', "Color", "Clarity", "Carat", "Price"]

The Star of the Show

To create the Star Chart, we must specify which columns to use and compute the angle of each axis around the circle using numpy.

import numpy as np

labels = ['Cut', 'Color', 'Clarity', 'Carat', 'Price']
points = len(labels)
angles = np.linspace(0, 2 * np.pi, points, endpoint=False).tolist()
angles += angles[:1]
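
To see what these angles look like, the sketch below prints them in degrees for five axes. The first angle is appended again at the end so that the plotted outline returns to its starting point and closes the shape.

```python
import numpy as np

points = 5
# Evenly spaced angles around the circle; endpoint=False avoids duplicating 0 at 2*pi.
angles = np.linspace(0, 2 * np.pi, points, endpoint=False).tolist()
print(np.degrees(angles).round(1).tolist())  # [0.0, 72.0, 144.0, 216.0, 288.0]

# Repeat the first angle so the outline closes.
angles += angles[:1]
print(len(angles))  # 6
```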

We then create a helper function to plot a diamond solely by the index number.

def add_to_star(diamond, color, label=None):
    # Look up the scaled values for this diamond and close the outline
    values = df2.loc[diamond].tolist()
    values += values[:1]
    if label is not None:
        ax.plot(angles, values, color=color, linewidth=1, label=label)
    else:
        # Fall back to the index number as the legend label
        ax.plot(angles, values, color=color, linewidth=1, label=diamond)
    ax.fill(angles, values, color=color, alpha=0.25)
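
The helper above relies on the global names `df2`, `angles`, and `ax`. As an alternative sketch, and not the author's version, the same logic can take those as arguments, which makes it easier to reuse with another dataframe or axes. The function name `add_to_star_explicit` and the `toy` data are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def add_to_star_explicit(ax, data, angles, row, color, label=None):
    """Hypothetical variant of add_to_star that receives its inputs as arguments."""
    values = data.loc[row].tolist()
    values += values[:1]  # close the outline
    ax.plot(angles, values, color=color, linewidth=1,
            label=row if label is None else label)
    ax.fill(angles, values, color=color, alpha=0.25)

# Made-up scaled values (0-100) for two rows.
toy = pd.DataFrame(
    {"Cut": [100, 25], "Color": [50, 80], "Clarity": [40, 60],
     "Carat": [90, 10], "Price": [95, 5]},
    index=[0, 1],
)
angles = np.linspace(0, 2 * np.pi, toy.shape[1], endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(figsize=(6, 6), subplot_kw=dict(polar=True))
add_to_star_explicit(ax, toy, angles, 0, "#1aaf6c", "Row 0")
add_to_star_explicit(ax, toy, angles, 1, "#429bf4")  # label falls back to the index
```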

Now the magic begins! We can begin populating our Star Chart with any diamonds we want. How about the most expensive and the two least expensive:

import matplotlib.pyplot as plt

## Create plot object
fig, ax = plt.subplots(figsize=(6, 6), subplot_kw=dict(polar=True))

## Plot a new diamond with the add_to_star function
add_to_star(27749, '#1aaf6c', "Most Expensive Diamond")
add_to_star(0, '#429bf4', "Least Expensive A")
add_to_star(1, '#d42cea', "Least Expensive B")

This is enough to create a Star Chart; however, there are no x labels, no orientation, and no custom flair. Let’s change that!

## Fix axis to start from the top and run clockwise
ax.set_theta_offset(np.pi / 2)
ax.set_theta_direction(-1)

## Draw axis lines for each angle and label
# angles[:-1] skips the repeated closing angle so there is one tick per label
ax.set_thetagrids(np.degrees(angles[:-1]), labels)

## Edit x axis labels
for label, angle in zip(ax.get_xticklabels(), angles):
    if angle in (0, np.pi):
        label.set_horizontalalignment('center')
    elif 0 < angle < np.pi:
        label.set_horizontalalignment('left')
    else:
        label.set_horizontalalignment('right')

## Customize your graphic
# Change the location of the gridlines or remove them
ax.set_rgrids([20, 40, 60, 80])
#ax.set_rgrids([]) # This removes grid lines
# Change the color of the ticks
ax.tick_params(colors='#222222')
# Make the y-axis labels larger, smaller, or remove by setting fontsize
ax.tick_params(axis='y', labelsize=0)
# Make the x-axis labels larger or smaller.
ax.tick_params(axis='x', labelsize=13)
# Change the color of the circular gridlines.
ax.grid(color='#AAAAAA')
# Change the color of the outer circle
ax.spines['polar'].set_color('#222222')
# Change the circle background color
ax.set_facecolor('#FAFAFA')

## Add title and legend
ax.set_title('Comparing Diamonds Across Dimensions', y=1.08)
ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1))

So what is the output?

Figure: A Star Chart of the most expensive and the two least expensive diamonds. Carat seems to be a large driver of price.

Best Bling for Your Buck

Which diamond has the highest rating across the 4 C’s for the lowest price? To find out, we must get the total value across the 4 C’s and divide by the raw (unscaled) price. This section operates on the original dataframe, which has all the raw columns. To find the total, we sum the four scaled columns.

df['Total'] = df['Cut_Adj'] + df['Color_Adj'] + df['Clarity_Adj'] + df['carat_Adj']

## Divide Value total by Price
df['4C_by_Price'] = df['Total'] / df['price']
df = df.sort_values(by="4C_by_Price", ascending=False)
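
After sorting, the best and worst deals sit at the top and bottom of the dataframe. If only those two index labels are needed, `idxmax` and `idxmin` return them directly without sorting. The sketch below uses made-up totals and prices, so the labels it prints are not the ones from the real dataset.

```python
import pandas as pd

# Made-up scores and prices for four diamonds.
toy = pd.DataFrame(
    {"Total": [250.0, 180.0, 300.0, 120.0], "price": [500, 900, 400, 6000]},
    index=[10, 11, 12, 13],
)
toy["4C_by_Price"] = toy["Total"] / toy["price"]

best = toy["4C_by_Price"].idxmax()   # index label with the most value per unit price
worst = toy["4C_by_Price"].idxmin()  # index label with the least value per unit price
print(best, worst)  # 12 13
```

The returned labels can be passed straight to a helper such as add_to_star to compare the two diamonds on one chart.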

The diamond with the most bling for our buck is #31597 and the diamond with the least bling for our buck is #26320. How do these diamonds compare on a Star Chart? Let’s explore below:

Figure: One is a diamond of a deal, the other is a chunk of carbon.

Conclusions

Thank you for exploring a few diamond characteristics in a Star Chart format using matplotlib. If you have any questions, post them below or on the GitHub repository, where the full code lives. My name is Cody Glickman and I can be found on LinkedIn. Be sure to check out some other articles about fun data science projects!

Translated from: https://towardsdatascience.com/stars-charts-in-python-9c20d02fb6c0
