Machine Learning Model Implementation: Evaluating Variable Importance Across Models


Great! You have completed your Machine Learning (ML) pipeline, ideally choosing the tuning parameters for each model through a careful validation procedure, and have converged on what you think is the final model for each strategy. What’s next? In this first part of the series, I’ll talk about some metrics and graphics beyond the area under the ROC curve that can help achieve a project’s goals. Sometimes, as data scientists, our tunnel vision for the data and modeling, or at least others’ perception that we have it, can limit our effectiveness as communicators and strategists. The elements I’ll demonstrate in these articles will help you see the problems you work on through your collaborators’ eyes and give you concrete ways to address their concerns and objectives for your ML work. Specifically, in this article we’ll:


  1. Discuss why a unifying framework to explore Variable Importance in Machine Learning is, well, important!
  2. Display R code to produce a Variable Importance plot using the results from a group of ML models.

The Importance of Variable Importance

Hopefully, you are already sold on the importance of trying different approaches when you tackle a prediction problem. Fitting models of varying complexity (e.g., both linear and non-linear) can identify the most parsimonious model that gives good prediction performance for the data we’re working with. However, the flip side of applying different models is the apparent difficulty of comparing results across these divergent strategies. What about the predictors that were used to fit the models: how can these be evaluated to determine which were important? As a data scientist, I challenge you to find a subject matter expert who, even when you deliver a model that predicts quite well, has no curiosity about what is happening under the hood! Your final model may have parameter coefficients with a straightforward interpretation, as in a linear model, but this is not always the case for more complex non-linear strategies. Luckily, although models have different parameterizations and optimization strategies, Relative Variable Importance is a unifying concept that allows features to be compared across models. More details on how Importance is computed for different models can be found in Applied Predictive Modeling (
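To make the idea of a common importance scale concrete, here is a minimal Python sketch. It is not the R code this article refers to, and it is not the model-specific importance calculation described in the book. It swaps in permutation importance, a model-agnostic technique: shuffle one predictor at a time and measure how much the held-out error grows. Each model's scores are then rescaled so that its top predictor gets 100, which puts a linear model and a non-linear one on the same footing. The synthetic data, the two toy models, and every function name below are assumptions made for illustration only.

```python
import random

def make_data(n, rng):
    """Synthetic data: x0 acts linearly, x1 non-linearly, x2 is pure noise."""
    rows, ys = [], []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(3)]
        rows.append(x)
        ys.append(3 * x[0] + 3 * x[1] ** 2 + rng.gauss(0, 0.1))
    return rows, ys

def fit_linear(rows, ys, lr=0.1, epochs=300):
    """Least-squares linear model fitted by batch gradient descent."""
    w, b, n = [0.0] * len(rows[0]), 0.0, len(rows)
    for _ in range(epochs):
        errs = [b + sum(wj * xj for wj, xj in zip(w, x)) - y
                for x, y in zip(rows, ys)]
        w = [wj - lr * 2 / n * sum(e * x[j] for e, x in zip(errs, rows))
             for j, wj in enumerate(w)]
        b -= lr * 2 / n * sum(errs)
    return lambda x: b + sum(wj * xj for wj, xj in zip(w, x))

def fit_knn(rows, ys, k=5):
    """k-nearest-neighbours regressor: average the targets of the k closest rows."""
    def predict(x):
        order = sorted(
            range(len(rows)),
            key=lambda i: sum((a - c) ** 2 for a, c in zip(rows[i], x)),
        )
        return sum(ys[i] for i in order[:k]) / k
    return predict

def mse(predict, rows, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(rows, ys)) / len(ys)

def permutation_importance(predict, rows, ys, rng, repeats=5):
    """Mean increase in MSE after shuffling each column (negatives clipped to 0)."""
    base = mse(predict, rows, ys)
    scores = []
    for j in range(len(rows[0])):
        increase = 0.0
        for _ in range(repeats):
            col = [x[j] for x in rows]
            rng.shuffle(col)
            shuffled = [x[:j] + [v] + x[j + 1:] for x, v in zip(rows, col)]
            increase += mse(predict, shuffled, ys) - base
        scores.append(max(increase / repeats, 0.0))
    return scores

def to_relative(scores):
    """Rescale so the most important predictor scores 100."""
    top = max(scores)
    return [100.0 * (s / top) if top > 0 else 0.0 for s in scores]

rng = random.Random(0)
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(100, rng)

models = {
    "linear": fit_linear(train_x, train_y),
    "knn": fit_knn(train_x, train_y),
}
relative = {
    name: to_relative(permutation_importance(model, test_x, test_y, rng))
    for name, model in models.items()
}

# One row per predictor, one column per model.
print(f"{'feature':<8}" + "".join(f"{name:>10}" for name in relative))
for j in range(3):
    print(f"x{j:<7}" + "".join(f"{relative[name][j]:>10.1f}" for name in relative))
```

In this toy setup, the two models should agree that `x0` matters most, but only the non-linear model can credit `x1`, whose effect is quadratic. Putting the scores side by side on one scale is what surfaces that kind of disagreement.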
