Building and Deploying Explainable AI Dashboards Using Dash and SHAP

This article describes how to build and deploy an explainable AI dashboard with Dash and the SHAP library, offering a practical way to understand how an AI model reaches its decisions.

In recent years, we have seen an explosion in the use of Machine Learning (ML) algorithms for automating and supporting human decisions. From helping doctors diagnose potential diseases that could render diabetic patients blind, to supporting financial analysts in equity management, the impact of ML is irrefutable. However, as algorithms become more sophisticated, it becomes harder to understand how they make decisions. As a result, harmful biases can arise in critical situations, such as when ML is used to determine credit limits for credit cards or applied to facial recognition by law enforcement agencies. In order to identify and mitigate the effect of such biases before a model is deployed, it is important to leverage explainable AI (xAI) to better understand which features and factors have the most impact on the model's final output. This information can then help AI developers understand why a model might not be performing well in certain scenarios, or in general. At the same time, such techniques can uncover more insight about a problem than black-box models alone. Ultimately, they help both technical and non-technical stakeholders better understand and appreciate AI models.


Explaining black-box models with Shapley Values

Researchers and software developers have been working on explainable AI systems for many years. One approach, called SHapley Additive exPlanations (SHAP), has become increasingly popular in the past few years. It has been used to identify chronic kidney disease, to model US mortality factors, and to help anaesthesiologists during surgeries. By applying a game-theory concept called Shapley values, you can interpret computer vision models, linear models, tree-based models, and more.


Building a production-ready xAI app with Dash

The dashboard was fully built in Python and runs SHAP and LightGBM in real time. Try it out!

Let’s take, as an example, the task of predicting the tips received by waiters based on features such as the total bill, the sex of the payer, the day and time, and so on. A black-box model that ingests all those features just to predict the tips could be highly biased and might not be useful for business analysts trying to understand customer behavior. However, by applying SHAP, we can gain more insight into the impact of each feature on the final value predicted by the model, which is useful for understanding how the model perceives factors prone to biases, such as sex, age, race, etc.


In order to make such xAI methods more accessible and interactive, we used Dash to build an end-to-end application that runs an ensemble model called LightGBM on custom inputs. The dashboard and the full model are deployed on our Dash Enterprise Kubernetes servers, and both the black-box model and SHAP are running in real-time.


In the app, you will find controls that let you adjust the total bill, sex, day of the week, etc. Each of those controls defines an input feature, and every time you update them, a new sample is given to the trained LightGBM model.



However, since LightGBM is a type of gradient boosting method, it is hard to directly interpret it. Therefore, we simulated the controls to allow the app to compute the SHAP values and display them in a waterfall chart.


In addition to specifying custom model inputs, you can also select a random example from the training set. Whenever you do this, you will see the real label appear on the right side (as a scatter point). You can then tweak the feature values to see how the various SHAP values change.


Moreover, you can also decide to make binary predictions (e.g. the sex of the customer) and interact with the graph using the Plotly modebar.



Bridging the gap between Python and advanced analytics

The current state-of-the-art ML algorithms for modeling continuous and categorical features (e.g. gradient boosting and neural networks) are usually written in optimized C/C++ code, but they can be conveniently used through Python. As a result, powerful xAI libraries like SHAP are interfaced in the same language, which lets us train and explain powerful models in just a few lines of Python code. However, although such models are popular in the ML community, considerable effort is needed to port them into traditional BI tools, including connecting external servers and adding third-party extensions. Furthermore, building UIs that let you train these Python-based ML libraries can quickly become cumbersome in those BI tools.


With Dash, you can seamlessly integrate popular and up-to-date ML libraries, which enable Dash app users to quickly answer “what if?” questions, and probe what the ML models have learned from the data. Most of the time, all you need to do is to install and freeze such libraries using pip, which is usually done in a few lines:


pip install dash shap lightgbm
pip freeze > requirements.txt

When you are ready to share your dashboard, all the dependencies and deployment are handled by the Dash Enterprise App Manager.


Gaining insights beyond black-box predictions

A classical argument against more advanced models is that, although they can improve accuracy, their complexity makes them harder to interpret. With linear regression, you can use the coefficients to judge which features weigh more than others in making a prediction; in the case of decision trees, you can visualize how the tree splits and sets thresholds for deciding the output. In the case of deep neural networks and ensemble models, you can visualize neither a tree structure nor coefficients; however, with SHAP, it is possible to explain not only how features generally affect the model, but also how each feature value influences the output in a specific example. For example, the model might believe that female customers tend to tip more when they are going out with a friend for dinner (on a Saturday) than when they are grabbing lunch alone (on a Thursday).


Left: Thursday lunch alone. Right: Saturday dinner with friends.

Such insight could either lead to a stronger understanding of customer behavior if it is backed by additional studies, or it could reveal some degree of systematic bias that would not have been otherwise uncovered without SHAP. With Dash, we make it easier to build and deploy custom dashboards that let you interpret all sorts of ML models, whether they are trained on predicting tips, or other types of data.


Putting the power of Python in the hands of business users

At Plotly, we are working on keeping Dash flexible yet easy to use for building ML apps and dashboards. For this reason, we built our app with components from the Dash Enterprise Design Kit, which makes it easy to tailor apps to meet style guidelines without delving into HTML and CSS. For example, if you don’t like the color of the bars in the default waterfall charts, you can easily change it with the Design Kit Theme Editor.



Furthermore, new features in Dash like pattern-matching callbacks let you simplify the process of creating callbacks. As a result, you can create very complex callbacks between components with very little effort. As an example, there are six controls in our app (one for each input feature), but we only need to specify one Input and one State to our callback to control all the components at the same time:



Then, in one line, we were able to construct a dictionary where the keys are the feature names and the values are what we will input to the model. From then on, it’s easy to process the dictionary in the input format most suitable to your model.


Are you interested in creating similar apps using state-of-the-art models and xAI algorithms? Contact us to learn more about how Dash Enterprise can help you build, design, and deploy ML dashboards — with no compromise.


Translated from: https://medium.com/plotly/building-and-deploying-explainable-ai-dashboards-using-dash-and-shap-8e0a0a45beb6
