How Business Managers Can Launch a Machine Learning Project

You are an innovator.

You are curious about machine learning.

You have an idea that could improve your business by leveraging the information hidden in data you already have or could collect.

As an expert in your industry, you are often the only person who can spot an opportunity for new income, cost reduction, or faster decisions. But getting business value from data requires a combination of skills from different people: business vision, mathematical skills, and technological skills.

How do you turn your idea into value?

Machine learning uses data to create self-modifying algorithms that can “learn” to produce the desired information. The machine learning practitioner creates algorithms and “trains” them to match expectations.

Your project will involve collecting and preparing data, training mathematical algorithms, and developing a digital solution to present the results (from a simple one-shot visualization to a dynamic dashboard or software integrated into your systems).

On the path to a successful implementation of your idea, you will have to work with internal or external partners with various roles and skills: business experts, IT people, machine learning experts, customer panels…

In some cases, you will first have to sell your project to a sponsor to get a go-ahead and a budget.

To succeed with all these people, you will have to find a common language.

I have listed 9 topics you should work on to explain your idea and launch your project.

Here is a template you could use to refine your project, and some advice on filling in each part of the template.

1. Idea

As a business leader, you have spotted an opportunity to improve your business: increase income, lower costs, make better decisions.

You think that you can use available or reachable data to learn new information.

In this part, you want to explain:

  • The information you want to learn (ex: customer profile, sales predictions, text category, probability of mechanical failure, …)
  • The data which are available or could be collected

2. Business Value

Who will benefit from the information learned by machine learning?

  • Will your customers be offered a new service?
  • Will you reduce the cost of an internal process?

3. Resulting Solution

How will your company use the resulting machine learning algorithm?

  • Is it a one-shot study whose result will be a report? (ex: segmentation of current customers)
  • Do you need your algorithm to be run on a regular basis to produce an updated dashboard? (ex: sales predictions)
  • Do you need your algorithm to be integrated into your manufacturing process? (ex: automatic routing depending on image classification)

4. Data Sources

You might use various groups of data. For each one, list:

  • Explicit content of the data
  • Source of the data: where the data were captured, and from whom (customer purchases, mechanical sensors, Twitter posts, medical reports…)
  • Estimated number of data items
  • Data presentation: are the data already structured (ex: a spreadsheet of transactions), or do they need pre-processing (ex: extracting meaningful values from physical sensor signals, collecting text from multiple sources in different formats…)?
  • Legal considerations relating to the storage and processing of the data
  • Data labeling

For most projects, before you can learn from your data, you need to know the “true answer” for a sufficient number of examples. For instance, among a set of financial transactions, you need to know which were proven to be fraudulent and which were proven not to be fraudulent. Data tagged with the true answer are called “labeled data”.

Are your data already labeled?

If your data are not labeled, you should have them labeled by internal people or an external partner.
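The checklist above can be kept as a small inventory, one record per group of data. The following is only a minimal sketch: the class name, the field names, and the example values are all assumptions made for illustration, not part of any standard.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """One row of the data-source inventory (all field names are illustrative)."""
    content: str        # explicit content of the data
    origin: str         # where and from whom the data were captured
    n_items: int        # estimated number of data items
    structured: bool    # False if pre-processing is needed
    legal_notes: str    # storage and processing constraints
    n_labeled: int = 0  # items for which the "true answer" is known

    def labeled_fraction(self) -> float:
        """Share of items that already carry a label (0.0 for an empty source)."""
        return self.n_labeled / self.n_items if self.n_items else 0.0


# Hypothetical example values, not taken from a real project.
transactions = DataSource(
    content="card transactions with amount, date, merchant",
    origin="customer purchases, internal payment system",
    n_items=200_000,
    structured=True,
    legal_notes="personal data; check retention rules",
    n_labeled=50_000,
)

print(f"{transactions.labeled_fraction():.0%} of items are labeled")
```

A low labeled fraction is an early signal that a labeling effort, internal or external, has to be planned and budgeted.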

5. Information to be learned

In this part, you will explain how a machine learning practitioner will learn the desired information from the available data.

Information to be learned

State explicitly which result you are targeting:

  • You want to learn a category: “Is this email a technical support request or a sales request?”, “Is this transaction fraudulent or not?”
  • You want to learn a value: “What is the estimated sales volume of this item for next week?”, “What is the probability that this machine will have a major failure before the end of the year?”
  • You want to discover patterns: “Is it possible to segment our customers into 5 groups with similar habits?”

The machine learning practitioner will turn your project into a “classification project, multi-class classification project, regression project, clustering project…”. It’s OK for you to use these words if they make sense to you. But you should keep your project written in words that all partners in your project can understand.

Performance level

State explicitly the level of reliability you need for your result.

Examples:

  • We should detect at least 99.9% of fraudulent transactions
  • We accept a 10% error in sales predictions for this category of products
  • The solution should classify an image in less than 0.5 seconds

The machine learning practitioner will turn your performance target into a mathematical definition: “false negatives, F-score, Jaccard score, confidence interval, …”. Try to get from her/him a precise explanation of the measure she/he will use. This will make sure you are on the same path.
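To make the link between a business target and its mathematical definition concrete, here is a minimal sketch of how such measures could be computed. It assumes that “detecting 99.9% of fraudulent transactions” is read as recall and that “10% of error” is read as a mean absolute percentage error; your practitioner may choose other definitions. All counts and sales figures below are hypothetical.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of actual fraud cases that were detected."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Share of flagged transactions that were really fraudulent."""
    return true_positives / (true_positives + false_positives)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (the F-score with beta = 1)."""
    return 2 * p * r / (p + r)


def mean_absolute_percentage_error(actual, predicted) -> float:
    """Average relative error of predictions, e.g. 0.10 for a 10% error.

    Assumes no actual value is zero.
    """
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical counts: 1000 real frauds, 999 detected, 1 missed, 111 false alarms.
r = recall(true_positives=999, false_negatives=1)
p = precision(true_positives=999, false_positives=111)
print(f"recall={r:.3f} precision={p:.3f} f1={f1_score(p, r):.3f}")

# Hypothetical weekly sales versus predictions.
mape = mean_absolute_percentage_error([100, 200, 400], [110, 180, 400])
print(f"sales prediction error={mape:.1%}")
```

Note that a high recall says nothing about false alarms: in this example the detection target is met, yet one flagged transaction in ten is legitimate. This is one reason to ask which measure will actually be reported.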

6. Feasibility

Before starting the project, it is not possible to guarantee that the targeted performance level can be reached.

Failure to learn the desired information can have different causes:

The information is not contained in the data.

If you try to estimate the age of a weather scientist from weather data, your project will not succeed, whatever the volume of your data, because the desired information is not contained in the data.

In this case, you will have to collect other sources of data to succeed.

The information lies in your data, but you don’t have enough (labeled) data.

In this case, you will have to get additional data to increase the performance level.

The developed algorithm and its technical implementation in the IT architecture cannot provide a response within the targeted time.

In this case, the algorithm and/or its implementation will have to be reworked in the hope of reaching the desired performance level.

At the beginning of your project, the machine learning practitioner should be able to rate the feasibility of your project based on the state of the art and the specifics of your environment. During the development of the project, she/he must keep you informed about performance results and explain the causes of low performance.

In this part of your project presentation, explain the estimated feasibility of the training, based on your interview with the practitioner.

7. Project Stakeholders

List all teams, departments, or external partners that will be involved:

Business teams (marketing, manufacturing, HR, …)

They define the business problem or the business opportunity. They evaluate the performance of the solution.

Data Lab, machine learning practitioner…

The machine learning practitioner transforms data, and develops and trains algorithms.

IT teams

For some projects, you need to involve your IT department to extract data from your systems or to deploy the machine learning solution for the end users.

8. Steps

This part is the place for a visual display of all the steps required and the people involved.

It typically includes all or part of the following items:

  1. Defining the business problem to be solved or the business opportunity
  2. Collecting and exploring existing data, collecting external data
  3. Exploratory Data Analysis: extracting evidence that the information you want is contained (even if hidden) in your data
  4. Machine learning training, resulting in an algorithm that can extract the desired business information
  5. Implementation of the end-user solution: report, dashboard, or business software
  6. Ongoing maintenance of your solution: the performance level could decrease and must be monitored.
  • ex: Sales prediction algorithms must be periodically retrained to keep up with new trends.
  • ex: Changes in data sources may arise (poor quality of new data, interruption of some data source, change in input data format…)
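The monitoring mentioned in step 6 can be as simple as comparing recent error measurements with the performance level agreed on in section 5. The sketch below is one possible rule, not a standard method: the function name, the averaging rule, and the `tolerance` parameter are assumptions, and the weekly figures are hypothetical.

```python
def needs_retraining(recent_errors, target_error, tolerance=0.0):
    """Return True when the average recent error exceeds the agreed target.

    recent_errors: error rates measured on recent periods (e.g. weekly).
    target_error:  the performance level agreed on for the project.
    tolerance:     extra margin before raising the alarm (an assumption).
    """
    if not recent_errors:
        raise ValueError("no recent measurements to evaluate")
    average = sum(recent_errors) / len(recent_errors)
    return average > target_error + tolerance


weekly_errors = [0.08, 0.09, 0.12, 0.15]  # hypothetical measurements
print(needs_retraining(weekly_errors, target_error=0.10))
```

An empty list raises an error rather than returning False, because missing measurements may themselves indicate an interrupted data source.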

9. Costs

Depending on the type of solution described in §3 “Resulting Solution”, describe here the nature of the costs: internal staff workload, external services, and maintenance costs.

Conclusions

When you have covered these 9 topics, you will have a precise definition of your project. You will be able to demonstrate how you will extract value from your data. Most importantly, a common language will keep you from being trapped in data science mysteries and will empower you to manage your machine learning project towards your business objectives.

Translated from: https://medium.com/swlh/business-manager-how-to-launch-a-machine-learning-project-3b75a42ddb6a
