1. Course introduction
- Machine Learning (ML) in industry
- A decade ago, ML was used mainly by "Big Tech"
- Today it is common for companies to use ML to drive revenue
- Top segments are: high tech, automotive, manufacturing, retail, finance, healthcare
- COVID-19 accelerated this trend
- McKinsey report analyzing revenue from AI adoption
2. Industrial ML applications
(1) Manufacturing: predictive maintenance, quality control
(2) Retail: recommendation, chatbots, demand forecasting
(3) Healthcare: alerts from real-time patient data, disease identification
(4) Finance: fraud detection, application processing
(5) Automotive: breakdown prediction, self-driving
- House sales prediction: the goal is to predict the bid price of the winning buyer
- ML workflow
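The workflow stages can be sketched on the house sales example. This is a minimal illustration under stated assumptions, not the course's own code: the data is synthetic, the single feature (living area) and the price formula are invented for the example, and the "model" is a one-feature least-squares line. The helper names `make_synthetic_sales`, `fit_line`, `predict`, and `mean_abs_error` are hypothetical.

```python
import random
import statistics


def make_synthetic_sales(n, seed=0):
    """Hypothetical data: (living area in square metres, winning bid price)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        area = rng.uniform(40, 200)
        # Assumed relationship, chosen only for illustration.
        price = 3000 * area + 50000 + rng.gauss(0, 10000)
        rows.append((area, price))
    return rows


def fit_line(rows):
    """Ordinary least squares with one feature: price ~ slope * area + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, area):
    """Stand-in for a deployed model: one area in, one predicted price out."""
    slope, intercept = model
    return slope * area + intercept


def mean_abs_error(model, rows):
    """Average absolute gap between predicted and actual prices."""
    return statistics.fmean([abs(predict(model, x) - y) for x, y in rows])


# 1. Formulate the problem: predict the winning bid price from house features.
# 2. Prepare data: here, a synthetic set split into training and validation parts.
rows = make_synthetic_sales(500)
train, valid = rows[:400], rows[400:]

# 3. Train the model.
model = fit_line(train)

# 4. Deploy: in this sketch, "deployment" is just calling predict().
example_price = predict(model, 100)

# 5. Monitor: track the error on data the model has not seen.
valid_mae = mean_abs_error(model, valid)
```

In a real project each stage is far larger (data collection, feature engineering, a serving system, dashboards), but the order of the steps is the same.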
3. Challenges
(1) Formulate the problem: focus on the most impactful industrial problems (self-service supermarkets, self-driving cars)
(2) Data: high-quality data is scarce, and there are privacy issues
(3) Train models: models are increasingly complex, data-hungry, and expensive
(4) Deploy models: heavy computation is not suitable for real-time inference
(5) Monitor: data distribution shifts, fairness issues
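To make the monitoring challenge concrete, here is a minimal sketch of one way to flag a shift in a single numeric feature between training data and live data. It uses the population stability index (PSI), a common heuristic that these notes do not mention; the function name `psi`, the equal-width binning, and the synthetic Gaussian data are all assumptions made for the example.

```python
import math
import random


def psi(expected, actual, bins=10, eps=1e-6):
    """Population stability index between two samples of one numeric feature.

    Bins are equal-width over the range of `expected`; values of `actual`
    outside that range are clipped into the first or last bin.
    Assumes `expected` contains at least two distinct values.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # eps keeps the logarithm finite when a bin is empty.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


rng = random.Random(0)
train_feature = [rng.gauss(0, 1) for _ in range(5000)]
live_same = [rng.gauss(0, 1) for _ in range(5000)]     # same distribution
live_shifted = [rng.gauss(1, 1) for _ in range(5000)]  # mean moved by one std

psi_same = psi(train_feature, live_same)
psi_shifted = psi(train_feature, live_shifted)
```

A commonly quoted rule of thumb treats PSI below about 0.1 as stable and above about 0.25 as a significant shift, but these thresholds are conventions rather than guarantees, and a per-feature check like this does not detect every kind of shift.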
- Roles
(1) Domain experts: have business insight, know which data is important and where to find it, and identify the real impact of an ML model
(2) Data scientists: full stack across data mining, model training, and deployment
(3) ML experts: customize SOTA ML models
(4) SDEs (software development engineers): develop and maintain data pipelines and model training and serving pipelines
- Career planning, and how data scientists allocate their time
4. Course topics
Techniques that a data scientist needs but that are often not taught in university ML, statistics, or programming courses
(1) Data: collecting and preprocessing data; covariate, concept, and label shift
(2) Train: model validation, combination, and tuning; transfer learning; multi-modality
(3) Deploy: model deployment; distillation
(4) Monitor: fairness, explainability
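The three kinds of shift listed under the data topic are usually distinguished as follows. This is the standard textbook formulation rather than something stated in these notes; $x$ denotes the features, $y$ the label, and the subscripts "train" and "deploy" (names chosen here) mark the distribution at training time and at deployment time. The block assumes the `amsmath` package.

```latex
\begin{align*}
\text{Covariate shift} &:\; p_{\text{train}}(x) \neq p_{\text{deploy}}(x),
  \ \text{while } p(y \mid x) \text{ stays fixed} \\
\text{Label shift} &:\; p_{\text{train}}(y) \neq p_{\text{deploy}}(y),
  \ \text{while } p(x \mid y) \text{ stays fixed} \\
\text{Concept shift} &:\; p_{\text{train}}(y \mid x) \neq p_{\text{deploy}}(y \mid x)
\end{align*}
```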
5. Summary
(1) Companies are adopting ML
(2) An ML workflow includes formulating the problem, preparing data, training and deploying ML models, and monitoring
(3) This course teaches the techniques a data scientist needs at each stage of the ML workflow