Data Analytics Lifecycle
Seven Key Roles on a Data Science Team
• Business User; Project Sponsor; Project Manager
• Business Intelligence Analyst
• Database Administrator
• Data Engineer; Data Scientist
Data Analytics Lifecycle Overview
• Six phases
- Discovery
- Data preparation (analytic sandbox)
- Model planning (methods, techniques, workflow, variables, relationships, models)
- Model building (training and test datasets, software, and hardware)
- Communicate results (identify key findings)
- Operationalize (delivery, pilot project)
Phase 1: Discovery
• Learning the Business Domain
• Resources
• Framing (scoping) the Problem
• Identifying Key Stakeholders
• Interviewing the Analytical Sponsor
• Developing Initial Hypotheses (IH)
• Identifying Potential Data Sources