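I. Defining the Questions
1.1 Classify products into four groups: best sellers, high-potential products, traffic drivers, and slow movers
1.2 Sales contribution and purchasing preferences by region
1.3 Sales contribution and purchasing preferences by customer segment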
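Questions 1.2 and 1.3 both come down to grouping order lines by one field and comparing totals. The following is a minimal sketch of that aggregation, not the method used in this project: the sample rows and the helper name `contribution_by` are hypothetical, and only the column names (Region, Segment, Sales, Profit) come from the dataset.

```python
from collections import defaultdict

# Hypothetical sample rows; the real dataset has many more columns and rows.
rows = [
    {"Region": "West", "Segment": "Consumer", "Sales": 200.0, "Profit": 40.0},
    {"Region": "West", "Segment": "Corporate", "Sales": 100.0, "Profit": -10.0},
    {"Region": "East", "Segment": "Consumer", "Sales": 100.0, "Profit": 20.0},
]


def contribution_by(rows, key):
    """Return {group: (total_sales, total_profit, share_of_total_sales)}."""
    sales = defaultdict(float)
    profit = defaultdict(float)
    for row in rows:
        sales[row[key]] += row["Sales"]
        profit[row[key]] += row["Profit"]
    grand_total = sum(sales.values())
    return {
        group: (
            sales[group],
            profit[group],
            # Guard against an empty or all-zero input.
            sales[group] / grand_total if grand_total else 0.0,
        )
        for group in sales
    }


by_region = contribution_by(rows, "Region")
by_segment = contribution_by(rows, "Segment")
```

The same helper can be pointed at Category or Sub-Category to look at purchasing preferences within each region or segment.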
II. Data Collection
2.1 The project data comes from the Kaggle platform: Superstore Dataset, 99,018 records in total
2.2 Field definitions:
Row ID => Unique ID for each row.
Order ID => Unique Order ID for each Customer.
Order Date => Order Date of the product.
Ship Date => Shipping Date of the Product.
Ship Mode => Shipping Mode specified by the Customer.
Customer ID => Unique ID to identify each Customer.
Customer Name => Name of the Customer.
Segment => The segment where the Customer belongs.
Country => Country of residence of the Customer.
City => City of residence of the Customer.
State => State of residence of the Customer.
Postal Code => Postal Code of every Customer.
Region => Region where the Customer belongs.
Product ID => Unique ID of the Product.
Category => Category of the product ordered.
Sub-Category => Sub-Category of the product ordered.
Product Name => Name of the Product.
Sales => Sales of the Product.
Quantity => Quantity of the Product.
Discount => Discount provided.
Profit => Profit/Loss incurred.
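As an illustration of how these fields might be typed once loaded, here is a small sketch that covers a subset of the columns. The class name `OrderLine`, the helper `parse_order_line`, the sample values, and especially the date format are assumptions; check the actual file before relying on them.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Assumption: dates are stored as month/day/year, e.g. 11/08/2016.
DATE_FORMAT = "%m/%d/%Y"


@dataclass
class OrderLine:
    order_id: str
    order_date: date
    ship_date: date
    segment: str
    region: str
    category: str
    sales: float
    quantity: int
    discount: float
    profit: float


def parse_order_line(raw: dict) -> OrderLine:
    """Convert one raw row (all values as strings) into typed fields."""
    return OrderLine(
        order_id=raw["Order ID"],
        order_date=datetime.strptime(raw["Order Date"], DATE_FORMAT).date(),
        ship_date=datetime.strptime(raw["Ship Date"], DATE_FORMAT).date(),
        segment=raw["Segment"],
        region=raw["Region"],
        category=raw["Category"],
        sales=float(raw["Sales"]),
        quantity=int(raw["Quantity"]),
        discount=float(raw["Discount"]),
        profit=float(raw["Profit"]),
    )


# Hypothetical row used only to exercise the parser.
sample = {
    "Order ID": "ORDER-0001", "Order Date": "11/08/2016",
    "Ship Date": "11/11/2016", "Segment": "Consumer", "Region": "South",
    "Category": "Furniture", "Sales": "261.96", "Quantity": "2",
    "Discount": "0", "Profit": "41.91",
}
line = parse_order_line(sample)
days_to_ship = (line.ship_date - line.order_date).days
```

Typing the fields early makes misplaced values surface as parse errors instead of silently skewing later totals.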
III. Data Processing
3.1 Back up the original table first
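One way to take such a backup, sketched here with Python's built-in `sqlite3` module, is to copy the table with `CREATE TABLE ... AS SELECT` and then compare row counts. The database engine, the table names `superstore` and `superstore_backup`, and the two sample rows are assumptions for illustration; the project may use a different database.

```python
import sqlite3

# In-memory database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE superstore (row_id INTEGER, sales REAL)")
conn.executemany(
    "INSERT INTO superstore VALUES (?, ?)",
    [(1, 261.96), (2, 731.94)],
)

# Copy every row into a separate backup table before any cleaning.
conn.execute("CREATE TABLE superstore_backup AS SELECT * FROM superstore")
conn.commit()

# Verify that the backup holds the same number of rows as the original.
original_count = conn.execute("SELECT COUNT(*) FROM superstore").fetchone()[0]
backup_count = conn.execute(
    "SELECT COUNT(*) FROM superstore_backup"
).fetchone()[0]
```

A table created this way copies the data only; indexes and constraints on the original are not carried over, which is acceptable for a restore point.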
3.2 Handle anomalous values
3.2.1 Columns in the original table are misaligned, which produces anomalous values
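A rough sketch of how misaligned rows could be located: flag any row that has more fields than the header, or whose numeric columns do not parse as numbers. This assumes the shift comes from unquoted commas inside a text field, which is only one possible cause; the helper names and the sample data are hypothetical.

```python
import csv
import io

NUMERIC_COLUMNS = ["Sales", "Quantity", "Discount", "Profit"]


def is_number(text):
    """Return True if text can be read as a float."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def find_misaligned_rows(reader):
    """Yield (line_number, row) for rows that look shifted."""
    for row in reader:
        # csv.DictReader stores surplus fields under the key None.
        has_extra_fields = None in row
        bad_numeric = any(
            not is_number(row.get(col)) for col in NUMERIC_COLUMNS
        )
        if has_extra_fields or bad_numeric:
            yield reader.line_num, row


# Hypothetical sample: the second data row has an unquoted comma in the
# product name, so every later value shifts one column to the right.
sample = io.StringIO(
    "Product Name,Sales,Quantity,Discount,Profit\n"
    "Stapler,12.5,2,0.0,3.1\n"
    "Desk, Oak,300.0,1,0.1,45.0\n"
)
flagged = [line for line, _ in find_misaligned_rows(csv.DictReader(sample))]
```

Flagged rows can then be inspected by hand and either repaired or excluded, with the backup available as a reference.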
<