Action Rules Discovery Using Machine Learning

In this article we are going to cover:

  • Interpretable machine learning, correlation vs. causation, use cases
  • A powerful Python package combining prediction and causal inference, an end-to-end Action Rules Discovery model

Links to my other articles

  1. Deep Kernels and Gaussian Processes
  2. Custom Loss Functions in TensorFlow
  3. Prediction and Inference with Boosted Trees
  4. Softmax classification
  5. Climate analysis

Introduction

Suppose you are a Data Magician working at a business and have been tasked with building a churn prediction model to predict which customers are at risk of unsubscribing from the service your business offers. You quickly spin up a deep neural network on some P100 GPUs on Colab Pro, get some high prediction accuracies, have a kombucha and call it a day. The next afternoon your boss calls you over Zoom and says that although the prediction capabilities are useful, she needs to know which factors contribute to the churn probability and by how much.

You revise the model and, instead of deep neural nets, go with an ensemble of LightGBM models combined with Shapley values computed over the individual boosted trees to produce factor inference outputs like the following:

[Image: https://github.com/slundberg/shap, under license to Haihan Lan]

You have another kombucha, satisfied at having discovered (quite robustly, too) which factors the model thinks contribute to churn, and call it a day.

Just before you log off on Friday afternoon, your boss calls you on Zoom again and tells you that although the factor inference insights are great, we still don't know whether these factors actually cause churn or retention. She reminds you that correlation does not imply causation, and that she can't tell which factors suggested by the model are actually causal and which are coincidences or spurious correlations.

[Image: an example of a spurious correlation]

Being the obsessive Data Magician you are, you scour the interwebs and stumble upon a hidden gem that does everything your boss asked for.

There is a branch of supervised machine learning called uplift modelling that answers questions like “How much will intervention/action X affect outcome Y?” given a dataset of historical records containing data on intervention X and outcome Y. The effect of X on Y is a quantity called uplift (typically expressed as a percentage when Y is a probability). In this article we briefly cover a package called actionrules, created by the authors of [1], and show how to apply it to discover action rules and quantify their impact on an outcome.

Action Rules

We will briefly cover some definitions regarding classification rules, action rules, support and confidence, and look at how uplift is estimated.

Classification rules r_n are defined as:

r_n = [(X_{1,n} ∧ X_{2,n} ∧ … ∧ X_{m,n}) → Y_n]

where the tuple (X_{1,n} ∧ X_{2,n} ∧ … ∧ X_{m,n}) holds particular values of the m input columns for the n-th rule. This tuple is called the antecedent, or ant, and the outcome Y_n is called the consequent. For example:

[(Age = 55 ∧ Smoking = Yes ∧ Weight = 240 lbs) → Risk of heart disease = Yes]

is a classification rule. Classification rules are further quantified by two numbers called support and confidence. The support is defined as

sup(ant → Y_n) := number of records matching (ant → Y_n)

which is the number of records in the dataset that match both the antecedent and the consequent. The confidence is defined as

conf(ant → Y_n) = sup(ant → Y_n) / sup(ant)

or the support of (ant → Y_n) divided by the number of records that match the antecedent alone.
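The two definitions can be made concrete with a minimal sketch in plain Python. It interprets the counts as numbers of dataset rows that match a rule; the toy records, the column names and the helper functions `matches`, `support` and `confidence` are all made up for illustration and are not part of the actionrules package.

```python
# Toy records; column names and values are illustrative assumptions only.
records = [
    {"Smoking": "Yes", "Overweight": "Yes", "HeartRisk": "Yes"},
    {"Smoking": "Yes", "Overweight": "Yes", "HeartRisk": "Yes"},
    {"Smoking": "Yes", "Overweight": "Yes", "HeartRisk": "No"},
    {"Smoking": "No", "Overweight": "Yes", "HeartRisk": "No"},
    {"Smoking": "No", "Overweight": "No", "HeartRisk": "No"},
]


def matches(record, conditions):
    """Return True if the record satisfies every attribute = value condition."""
    return all(record.get(attr) == value for attr, value in conditions.items())


def support(records, antecedent, consequent=None):
    """Count records matching the antecedent (and the consequent, if given)."""
    conditions = dict(antecedent)
    if consequent:
        conditions.update(consequent)
    return sum(1 for record in records if matches(record, conditions))


def confidence(records, antecedent, consequent):
    """sup(ant -> Y) / sup(ant); defined as 0.0 when nothing matches ant."""
    antecedent_count = support(records, antecedent)
    if antecedent_count == 0:
        return 0.0
    return support(records, antecedent, consequent) / antecedent_count


ant = {"Smoking": "Yes", "Overweight": "Yes"}
cons = {"HeartRisk": "Yes"}
sup = support(records, ant, cons)      # rows matching antecedent and consequent
conf = confidence(records, ant, cons)  # fraction of antecedent rows with that outcome
print(sup, conf)
```

Here three rows match the antecedent and two of them also match the consequent, so the rule holds for two thirds of the rows it applies to.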

Action rules are an extension of classification rules:

a_n = [f ∧ (X → X')] → (Y → Y')

where f is a set of fixed or unchangeable attributes. An action rule states that, for the given conjunction of fixed attributes, changing the non-fixed or flexible attributes from the initial set X to X' will change the outcome from Y to Y'. A concrete example is

[(Age = 55) ∧ (Smoking = Yes → No) ∧ (Weight = 240 lbs → 190 lbs)] → (Risk of heart disease = Yes → No)

Meaning that, in principle, if our subject, with age held fixed, quit smoking and lost weight through diet and exercise, they would no longer be at risk of heart disease. We can again use confidence and support to quantify the quality of the action rules discovered by some method. The support of an action rule takes into account the two classification rules r_1 = (X → Y) and r_2 = (X' → Y') that constitute the action rule and is defined as

sup(a_n) = min(sup(r_1), sup(r_2))

and the confidence of the action rule is defined as

conf(a_n) = conf(r_1) * conf(r_2).

Intuitively, the support of an action rule can be no larger than the smaller of the supports of its two classification rules, and, since each confidence is at most 1, the confidence of the action rule is less than or equal to the confidence of either classification rule.
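As a small worked example of these two formulas, the sketch below combines the support and confidence of two classification rules into those of an action rule. The input numbers are invented for illustration, and the function names are hypothetical helpers rather than anything from the actionrules package.

```python
def action_rule_support(sup_r1, sup_r2):
    """sup(a_n) = min(sup(r_1), sup(r_2))."""
    return min(sup_r1, sup_r2)


def action_rule_confidence(conf_r1, conf_r2):
    """conf(a_n) = conf(r_1) * conf(r_2)."""
    return conf_r1 * conf_r2


# Made-up values for r_1 = (X -> Y) and r_2 = (X' -> Y').
sup_r1, conf_r1 = 120, 0.8
sup_r2, conf_r2 = 80, 0.7

sup_a = action_rule_support(sup_r1, sup_r2)
conf_a = action_rule_confidence(conf_r1, conf_r2)

# The action rule's confidence never exceeds either classification rule's.
print(sup_a, round(conf_a, 2), conf_a <= min(conf_r1, conf_r2))
```

With these inputs the action rule inherits the smaller support (80) and a confidence of roughly 0.56, lower than both 0.8 and 0.7.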

Finally, the uplift is defined as

uplift = P(outcome | treatment) - P(outcome | no treatment).

In general, any uplift model attempts to estimate the two conditional probabilities above.
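The simplest possible estimate of the two conditional probabilities is the observed outcome rate in each group. The sketch below does exactly that on invented data; it is only a naive illustration, is not how the actionrules package computes uplift, and the difference is only a fair estimate of a causal effect when the treatment was assigned at random.

```python
def outcome_rate(outcomes):
    """Fraction of records with a positive outcome (1 = outcome occurred)."""
    return sum(outcomes) / len(outcomes)


def estimate_uplift(treated_outcomes, control_outcomes):
    """Naive estimate: P(outcome | treatment) - P(outcome | no treatment)."""
    return outcome_rate(treated_outcomes) - outcome_rate(control_outcomes)


# Made-up data: 1 = customer retained, 0 = customer churned.
treated = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]  # 8 of 10 retained
control = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]  # 6 of 10 retained

uplift = estimate_uplift(treated, control)
print(round(uplift, 3))
```

A positive value means the outcome was more frequent in the treated group than in the untreated group.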

Code Example

Details of the action rules discovery algorithms can be found in the source [1]. Briefly, the actionrules package incorporates a heuristic-based classification and action rule discovery algorithm that runs in a supervised manner (meaning we must have and specify the targets or outcome labels). We will run the action rules model on telco.csv, a toy customer churn dataset from a telecom company.

https://github.com/hhl60492/actionrules/blob/master/notebooks/data/telco.csv

According to the Kaggle dataset page [2], the data set includes information about:

  • Customers who left within the last month: the column is called Churn
  • Services that each customer has signed up for: phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
  • Customer account information: how long they've been a customer, contract, payment method, paperless billing, monthly charges, and total charges
  • Demographic info about customers: gender, age range, and whether they have partners and dependents

First we install the actionrules package using the console:

pip install actionrules-lukassykora
# you can also call the following command in a Jupyter Notebook
# !pip install actionrules-lukassykora

Next, import the relevant packages:

import pandas as pd
from actionrules.actionRulesDiscovery import ActionRulesDiscovery

Read in the dataset and check the first few rows:

dataFrame = pd.read_csv("telco.csv", sep=";")
dataFrame.head()

Now instantiate the action rules model and fit it to the data:

import time

actionRulesDiscovery = ActionRulesDiscovery()
actionRulesDiscovery.load_pandas(dataFrame)

start = time.time()
# define the stable and flexible attributes
actionRulesDiscovery.fit(
    stable_attributes=["gender", "SeniorCitizen", "Partner"],
    flexible_attributes=[
        "PhoneService",
        "InternetService",
        "OnlineSecurity",
        "DeviceProtection",
        "TechSupport",
        "StreamingTV",
    ],
    consequent="Churn",  # outcome column
    conf=60,  # minimum confidence for classification rules
    supp=4,  # minimum support for classification rules
    desired_classes=["No"],  # outcome class
    is_nan=False,
    is_reduction=True,
    min_stable_attributes=1,  # min stable attributes in antecedent
    min_flexible_attributes=1,  # min flexible attributes in antecedent
)
end = time.time()
print("Time: " + str(end - start) + "s")

The run took approximately 9 seconds on a MacBook Pro with a 1.4 GHz Intel Core i5.

Next we count the number of action rules discovered:

print(len(actionRulesDiscovery.get_action_rules()))

It turns out that 8 action rules were discovered. Let's now take a look at what the actual rules are:

for rule in actionRulesDiscovery.get_action_rules_representation():
    print(rule)
    print(" ")

An example of one of the rules we discovered:

r = [(Partner: no) ∧ (InternetService: fiber optic → no) ∧ (OnlineSecurity: no → no internet service) ∧ (DeviceProtection: no → no internet service) ∧ (TechSupport: no → no internet service) ] ⇒ [Churn: Yes → No] with support: 0.06772682095697856, confidence: 0.5599898610564512 and uplift: 0.05620874238092184.

We've discovered an interesting phenomenon: telecom customers who are single (Partner: no) with no internet service are less likely to churn by approximately 5.6 percentage points, with a decently high support of about 6.8% and a confidence of about 56%.
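For reference, the raw values printed for this rule can be converted into rounded percentages with a few lines of Python. The dictionary below simply copies the three numbers from the rule output; the variable names are illustrative.

```python
# Values copied from the printed rule above.
rule_metrics = {
    "support": 0.06772682095697856,
    "confidence": 0.5599898610564512,
    "uplift": 0.05620874238092184,
}

# Express each metric as a percentage rounded to one decimal place.
as_percent = {name: round(value * 100, 1) for name, value in rule_metrics.items()}

for name, value in as_percent.items():
    print(f"{name}: {value}%")
```

Since uplift is a difference of two probabilities, its value is best read as percentage points rather than as a relative change.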

From a business perspective, this suggests that in the short term we may need to tone down aggressive marketing of add-on internet services to single customers (Partner: no) in order to reduce churn. Eventually, though, we will need to design a better marketing strategy for that demographic, since as a telecom company we still want to sell as many additional internet services as possible.

The notebook with the code example and results above is here:

https://github.com/hhl60492/actionrules/blob/master/notebooks/Telco%20-%20Action%20Rules.ipynb

Feel free to play around with the flexible attributes and the minimum confidence and support, as modifying those hyperparameters can give different results.

Conclusion

We saw how classification rules and action rules are defined, how confidence and support are computed for each, the difference between fixed and flexible attributes, and an example of how action rules can be discovered using supervised learning and the actionrules Python package.

Finding action rules with high support, high confidence and high uplift can give business stakeholders new insight into which actions to take in order to maximize a certain outcome.

[1] Sýkora, Lukáš, and Tomáš Kliegr. “Action Rules: Counterfactual Explanations in Python.” RuleML Challenge 2020. CEUR-WS. http://ceur-ws.org/Vol-2644/paper36.pdf

[2] https://www.kaggle.com/blastchar/telco-customer-churn

Translated from: https://towardsdatascience.com/action-rules-discovery-using-machine-learning-1cba6cd680d7
