Using Dash to Pilot a Predictive Model for Heart Disease

Background

In a previous article, a model for predicting heart disease was developed using PyCaret. Now, assume we wish to pilot this model in a clinical setting. The first thing we need is a UI for the model so that it can be deployed. We will also assume that the tool is intended for screening, to recommend follow-up actions, and not for diagnosis.

For the purposes of illustration in this example, we will focus on the following:

  1. Identifying minimum key requirements and how to address them.
  2. Sketching out a UI that incorporates key requirements.
  3. Creating a working version using Dash.

The application was developed using PyCharm and all materials can be found on my GitHub.

Disclaimer: in a real-world setting, when a new predictive model is deployed to prevent, diagnose, or treat health conditions, numerous considerations apply, including demonstrated impact, ethics and algorithm bias, and legislative requirements (e.g., SaMD, patient privacy). While these considerations are extremely important, they are beyond the scope of this article.

Requirements

Suppose that, in collaboration with clinical colleagues, the following requirements were identified:

  1. A place to input the patient characteristics used as features in the predictive model.
  2. An output risk score for the patient that also assigns them to a risk group.
  3. Help for the health care practitioner in understanding and communicating which risk factors are contributing to the patient's risk. We will use SHAP values for this.
  4. Recommended next actions based on the risk group the patient falls into.
  5. Information about how the model used to generate the predicted risk score was developed, and how it performs. We can also include some general information about the cohort and important risk factors.

The user interface

Based on the requirements above, a simple structure with the following sections is a reasonable starting point:

  1. Data input (requirement 1)
  2. Predictive model outputs/interpretation/recommended next actions (requirements 2, 3 and 4)
  3. Model development information (requirement 5)

A rough layout for our Dash app is shown in Figure 1.

Figure 1. Rough sketch of what we want to create using Dash. Please excuse the handwriting! (Image by Author)

The Dash application

Based on the requirements and sketch above, the first working application is shown in Figure 2. Please note that the screenshots were taken on a wide screen (24-inch, 16:9, 1920 x 1080), and the alignment may be off on smaller or non-wide screens. I highly recommend cloning the GitHub repo and exploring the UI yourself!

Figure 2. The main view! (Image by Author)

Data input

On the left-hand side of the application's main view, there are input mechanisms for the user to enter the information required to generate the features for the predictive model (Figure 3). Drop-down menus are used where possible (all categorical features); values for numeric features are entered directly.

Figure 3. Left-hand panel of the main view, which covers data entry for the predictive model. (Image by Author)

Typically, the data preprocessing pipeline would be embedded in the application to handle steps such as imputation or scaling. In this case, because preprocessing is minimal and in all but one case the data entered is the feature value itself, I opted to keep it simple.
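
As a rough illustration of the input panel described above, the sketch below pairs a drop-down menu for a categorical feature with numeric input boxes, and collects the entered values into a single feature row. The feature names (`sex`, `age`, `chol`), the option labels, and the helper names are assumptions made for illustration and are not taken from the actual application. Dash is imported inside the layout function so that the plain-Python helper can be tried without it.

```python
# Hypothetical feature definitions: names, labels and codes are illustrative only.
CATEGORICAL = {"sex": {"Female": 0, "Male": 1}}
NUMERIC = ["age", "chol"]


def build_input_panel():
    """Return a Dash layout with one widget per feature (assumes Dash >= 2.0)."""
    from dash import dcc, html  # imported lazily; only needed when building the UI

    widgets = []
    for name, options in CATEGORICAL.items():
        widgets.append(html.Label(name))
        widgets.append(
            dcc.Dropdown(
                id=f"input-{name}",
                options=[{"label": k, "value": v} for k, v in options.items()],
            )
        )
    for name in NUMERIC:
        widgets.append(html.Label(name))
        widgets.append(dcc.Input(id=f"input-{name}", type="number"))
    return html.Div(widgets)


def to_feature_row(form_values):
    """Validate raw form values and return a dict with one entry per feature."""
    row = {}
    for name, options in CATEGORICAL.items():
        value = form_values.get(name)
        if value not in options.values():
            raise ValueError(f"'{name}' must be one of {sorted(options.values())}")
        row[name] = value
    for name in NUMERIC:
        value = form_values.get(name)
        if value is None:
            raise ValueError(f"'{name}' is required")
        row[name] = float(value)
    return row
```

In a Dash callback, the values of the `input-...` components would be gathered into `form_values` before calling `to_feature_row` and passing the result to the model.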

Predictive model outputs/recommended actions

The right-hand side of the application summarizes the required outputs based on our predictive model (Figure 4). First, we get a predicted risk score, which is used to place the patient into one of three risk groups.

Figure 4. Predicted risk summary and recommended actions. (Image by Author)
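
Assigning a risk group from the predicted score can be as simple as comparing it against two cut-offs. The sketch below assumes the score is a probability between 0 and 1. The cut-off values (0.3 and 0.6) and the group labels are placeholders, not the ones used in the application, and in practice they would be agreed with clinical colleagues.

```python
def assign_risk_group(score, low_cutoff=0.3, high_cutoff=0.6):
    """Map a predicted risk score in [0, 1] to one of three risk groups.

    The default cut-offs are illustrative placeholders only.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < low_cutoff:
        return "low"
    if score < high_cutoff:
        return "medium"
    return "high"
```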

Next, SHAP values and a waterfall plot are used to convey which patient factors contribute most to risk. The idea is to help the user see which factors the model considers to be contributing to heart disease risk and which, if modifiable, could be targeted with lifestyle interventions. Of course, this comes with the usual caveat that association is not causation. The waterfall plot from the shap package (by Scott Lundberg) isn't easily incorporated into a Dash application, so I created my own replica using the Plotly waterfall plot.
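
One way to build such a replica is to turn the SHAP values for a single patient into the inputs of a Plotly waterfall trace. The sketch below is not the author's implementation: the function names are made up, and it assumes the SHAP values and the model's base (expected) value have already been computed elsewhere.

```python
def waterfall_inputs(base_value, shap_values):
    """Order features by absolute SHAP value and compute the final prediction.

    shap_values maps feature name -> SHAP contribution for one patient.
    Returns (labels, contributions, final_value), ordered from the smallest
    to the largest absolute contribution.
    """
    ordered = sorted(shap_values.items(), key=lambda item: abs(item[1]))
    labels = [name for name, _ in ordered]
    contributions = [value for _, value in ordered]
    final_value = base_value + sum(contributions)
    return labels, contributions, final_value


def waterfall_figure(base_value, shap_values):
    """Build a horizontal Plotly waterfall figure (requires the plotly package)."""
    import plotly.graph_objects as go  # imported lazily; only needed for plotting

    labels, contributions, final_value = waterfall_inputs(base_value, shap_values)
    trace = go.Waterfall(
        orientation="h",
        measure=["relative"] * len(labels),
        x=contributions,  # bar lengths: each feature's SHAP contribution
        y=labels,
        base=base_value,  # start the first bar at the model's expected value
    )
    fig = go.Figure(trace)
    fig.update_layout(title=f"Predicted value: {final_value:.2f}")
    return fig
```

The returned figure can be passed to a `dcc.Graph` component. Depending on the explainer and model, the SHAP values may be on a log-odds scale rather than a probability scale, so the title value is not necessarily the displayed risk score.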

Finally, there is a (fictional) recommendation of actions based on which of the three risk groups a patient falls into. The idea is to prompt a set of sensible next steps, whether lifestyle changes, further testing, or follow-up appointments. This could also be used as a place to reference approved guidelines for prevention/treatment of a medical condition or as a starting point for referral to appropriate services (e.g., nutrition counselling). The risk grouping and corresponding actions should be developed in collaboration with clinical colleagues.

Model information

In this section we added information about the data used, the selected model, the training/tuning process, performance metrics, a study cohort table that descriptively compares patients with and without heart disease, and SHAP-based feature importance (Figure 5).

Figure 5. The expandable panel at the bottom of the screen with basic information about model development. (Image by Author)

We could also consider including the average characteristics of the entire study cohort, external validation procedures or other studies if applicable, references to published work supporting the tool, and highlights of any feature interactions.

Potential for improvement

As always, any pilot UI has potential for improvement, for example:

  1. Simplified data entry: We could have reduced the amount of information to be entered, as some features have no impact on predicted risk. I kept everything so that others who develop their own model using the same dataset can simply swap the model .pkl files and use the dashboard as is.

  2. Automated data feed: We could use file upload or a direct database connection instead of manual data entry. What we have suffices for this example, though, and would not be overly burdensome as part of a pilot.

  3. Model calibration: Using predicted risk score strata is useful for recommending follow-up actions, as was the case in this example. If we report the likelihood of heart disease to a patient, our predicted probability must be well calibrated (i.e., aligned with the actual chance that heart disease occurs).
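
A simple way to check calibration is to bin the predicted probabilities and compare the mean prediction in each bin with the observed rate of heart disease. The sketch below does this in plain Python. The function name, the bin count, and the toy data are illustrative assumptions, and a real check would use held-out patient data.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Compare mean predicted probability with the observed event rate per bin.

    y_true: iterable of 0/1 outcomes; y_prob: predicted probabilities in [0, 1].
    Returns a list of (mean_predicted, observed_rate, count) for non-empty bins.
    """
    bins = [[] for _ in range(n_bins)]
    for outcome, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in last bin
        bins[index].append((outcome, prob))

    table = []
    for members in bins:
        if not members:
            continue
        count = len(members)
        mean_predicted = sum(prob for _, prob in members) / count
        observed_rate = sum(outcome for outcome, _ in members) / count
        table.append((mean_predicted, observed_rate, count))
    return table


# Toy data for illustration only.
outcomes = [0, 0, 1, 0, 1, 1]
probabilities = [0.1, 0.1, 0.5, 0.5, 0.9, 0.9]
for mean_predicted, observed_rate, count in calibration_table(outcomes, probabilities):
    print(f"predicted {mean_predicted:.2f}  observed {observed_rate:.2f}  n={count}")
```

For a well-calibrated model, the predicted and observed values are close in every bin; large gaps suggest the raw score should not be presented to a patient as a probability.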

What other improvements would you make, or what might you have done differently?

Summary

We got the dashboard up and running for our pilot. Dash sits towards the custom end of the deployment spectrum. The model could also have been deployed via tools such as Power BI or Qlik as mid-range options, or we could take a lower-end approach and create an API using, for example, FastAPI. In a future article I will attempt to recreate the Dash application using these other options.

Thanks for reading, and I hope you spend some time exploring the Dash UI from this example. As always, comments, thoughts, feedback, and discussion are very welcome!

Translated from: https://towardsdatascience.com/using-dash-to-pilot-a-predictive-model-for-heart-disease-a1dab01035ac
