2023高教社数学建模国赛C题 - 蔬菜类商品的自动定价与补货决策(完整代码)

 商超在现代经济中扮演着重要角色,但在蔬菜商品管理方面面临多重挑战。这些挑战包括准确预测销售趋势、合理制定价格策略和有效制定补货计划等。解决这些问题对商超至关重要,因为它们直接影响销售收益、库存成本和客户满意度。因此,本研究旨在为商超提供一套全面的蔬菜商品管理策略,以帮助它们更好地应对这些挑战。这些策略包括数据分析、市场调研、供应链优化和定价策略等。通过综合运用这些策略,商超能够更准确地预测市场需求、优化库存、制定合理的价格策略,并提高客户满意度。

#!/usr/bin/env python
# coding: utf-8

# In[1]:


import pandas as pd
# Load the provided files
attachment_1 = pd.read_excel(r"C:\Users\86136\Desktop\C题(1)\附件1.xlsx")
attachment_2 = pd.read_excel(r"C:\Users\86136\Desktop\C题(1)\附件2.xlsx")
# Display the first few rows of each dataset for a preliminary inspection
attachment_1.head(), attachment_2.head()


# In[2]:


# Merge the datasets on "单品编码"
merged_data = pd.merge(attachment_2, attachment_1, on="单品编码", how="left")
# Display the first few rows of the merged dataset
merged_data.head()


# In[3]:


# Save the merged dataset to your desktop
merged_data.to_excel(r"C:\Users\86136\Desktop\merged_dataset.xlsx", index=False)


# In[11]:


# 加载数据
merged_data = pd.read_excel(r"C:\Users\86136\Desktop\merged_dataset.xlsx")
# 按品类和商品分类销售量
category_sales = merged_data.groupby('分类名称')['销量(千克)'].sum().sort_values(ascending=False)
product_sales = merged_data.groupby('单品名称')['销量(千克)'].sum().sort_values(ascending=False)


# In[25]:


# Loading the provided files
attachment1 = pd.read_excel(r"C:\Users\86136\Desktop\C题(1)\附件1.xlsx")
attachment2 = pd.read_csv(r"C:\Users\86136\Desktop\附件2.csv")
attachment3 = pd.read_excel(r"C:\Users\86136\Desktop\C题(1)\附件3.xlsx")
attachment4 = pd.read_excel(r"C:\Users\86136\Desktop\C题(1)\附件4.xlsx")
import matplotlib.pyplot as plt

# Merging attachment1 with attachment2 to get category names for each sale
merged_sales_data = attachment2.merge(attachment1, on='单品编码', how='left')

# Grouping the merged data by '分类名称' (Category Name) to get the total sales volume for each category
category_sales_distribution = merged_sales_data.groupby('分类名称')['销量(千克)'].sum().sort_values(ascending=False)

# Plotting the sales distribution for each category
plt.figure(figsize=(10,6))
category_sales_distribution.plot(kind='bar', color='skyblue')
plt.title("销售量分布 - 蔬菜品类")
plt.ylabel("销售量 (千克)")
plt.xlabel("蔬菜品类")
plt.xticks(rotation=45)
plt.grid(axis='y')

plt.show()



# In[26]:


# Grouping the merged data by '单品名称' (Product Name) to get the total sales volume for each product
product_sales_distribution = merged_sales_data.groupby('单品名称')['销量(千克)'].sum().sort_values(ascending=False)

# Plotting the sales distribution for the top 10 products
plt.figure(figsize=(12,7))
product_sales_distribution.head(10).plot(kind='bar', color='lightgreen')
plt.title("销售量分布 - 单品 (Top 10)")
plt.ylabel("销售量 (千克)")
plt.xlabel("单品名称")
plt.xticks(rotation=45)
plt.grid(axis='y')

plt.show()


# In[29]:


import matplotlib.pyplot as plt

# Group by '分类名称' and '单品名称' and sum the sales volume
category_sales = merged_data.groupby('分类名称')['销量(千克)'].sum().sort_values(ascending=False)
product_sales = merged_data.groupby('单品名称')['销量(千克)'].sum().sort_values(ascending=False)

# Plotting
fig, ax = plt.subplots(2, 1, figsize=(12, 12))

# Plot for Category Sales
category_sales.plot(kind='bar', ax=ax[0], color='teal')
ax[0].set_title('Total Sales Volume by Category')
ax[0].set_ylabel('Sales Volume (kg)')
ax[0].set_xlabel('Category Name')

# Plot for Product Sales (Top 10 products)
product_sales.head(10).plot(kind='bar', ax=ax[1], color='coral')
ax[1].set_title('按时间分类的销售量趋势')
ax[1].set_ylabel('销售额 (kg)')
ax[1].set_xlabel('产品名称')

plt.tight_layout()
plt.show()


# In[30]:


# Group by '销售日期' and '分类名称' and sum the sales volume
category_date_sales = merged_data.groupby(['销售日期', '分类名称'])['销量(千克)'].sum().reset_index()

# Plotting
plt.figure(figsize=(16, 8))
for category in category_date_sales['分类名称'].unique():
    subset = category_date_sales[category_date_sales['分类名称'] == category]
    plt.plot(subset['销售日期'], subset['销量(千克)'], label=category)

plt.title('按时间分类的销售量趋势')
plt.xlabel('时间')
plt.ylabel('销量 (kg)')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()


# In[36]:


# Pivot table to get sales data in wide format for correlation analysis
correlation_data = category_date_sales.pivot(index='销售日期', columns='分类名称', values='销量(千克)')

# Compute the correlation matrix
correlation_matrix = correlation_data.corr()

# Plot the heatmap for correlation matrix
plt.figure(figsize=(10, 6))
plt.title('Correlation Between Sales Volumes of Different Categories')
plt.xticks(rotation=45)
plt.yticks(rotation=45)
cax = plt.matshow(correlation_matrix, cmap='coolwarm', vmin=-1, vmax=1)
plt.colorbar(cax)
plt.xticks(range(len(correlation_matrix.columns)), correlation_matrix.columns)
plt.yticks(range(len(correlation_matrix.columns)), correlation_matrix.columns)
plt.show()


# In[37]:


#问题二


# In[38]:


# Load the provided files for the next question
attachment_3 = pd.read_excel(r"C:\Users\86136\Desktop\C题(1)\附件3.xlsx")
attachment_4 = pd.read_excel(r"C:\Users\86136\Desktop\C题(1)\附件4.xlsx")

# Display the first few rows of each dataset for a preliminary inspection
attachment_3.head(), attachment_4.head()


# In[50]:


from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

# Group by date and category and sum the sales volume
daily_sales = merged_data.groupby(['销售日期', '分类名称'])['销量(千克)'].sum().reset_index()

# Create a function to forecast the sales for a given category using a simple linear regression model
def forecast_sales(category):
    category_data = daily_sales[daily_sales['分类名称'] == category].reset_index(drop=True)
    category_data['day_num'] = np.arange(len(category_data))
    
    X = category_data[['day_num']]
    y = category_data['销量(千克)']
    
    # Split the data into training and testing sets (80% train, 20% test)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, shuffle=False)
    
    # Train a linear regression model
    model = LinearRegression()
    model.fit(X_train, y_train)
    
    # Predict the sales for the test set
    y_pred = model.predict(X_test)
    
    # Calculate the mean squared error of the prediction
    mse = mean_squared_error(y_test, y_pred)
    
    # Predict the sales for the next 7 days
    next_7_days = np.array([max(category_data['day_num']) + i for i in range(1, 8)]).reshape(-1, 1)
    future_forecast = model.predict(next_7_days)
    
    return future_forecast, mse
    

# Forecast the sales for each category
forecasts = {}
errors = {}
for category in daily_sales['分类名称'].unique():
    forecasts[category], errors[category] = forecast_sales(category)

forecasts, errors


# In[51]:


# Re-generating the day_num column for the entire daily_sales DataFrame
daily_sales['day_num'] = daily_sales.groupby('分类名称').cumcount()

# Visualization based on forecasts and errors

# Plotting sales forecast vs actual sales for each category
plt.figure(figsize=(16, 8))

for category in daily_sales['分类名称'].unique():
    category_data = daily_sales[daily_sales['分类名称'] == category].reset_index(drop=True)
    X = category_data[['day_num']]
    y = category_data['销量(千克)']
    _, X_test, _, y_test = train_test_split(X, y, test_size=0.2, random_state=42, shuffle=False)
    
    # Plotting the actual test data
    plt.scatter(X_test.day_num, y_test, label=f"Actual {category}", marker='x')
    
    # Plotting the forecasted data
    next_7_days = np.array([max(category_data['day_num']) + i for i in range(1, 8)])
    plt.plot(next_7_days, forecasts[category], label=f"Forecast {category}")

plt.title('销售预测与实际销售')
plt.xlabel('时间')
plt.ylabel('销量 (kg)')
plt.legend(loc='upper left')
plt.grid(True, which='both', linestyle='--', linewidth=0.5)
plt.tight_layout()
plt.show()

# Plotting the Mean Squared Error for each category
plt.figure(figsize=(10, 6))
plt.bar(errors.keys(), errors.values(), color='skyblue')
plt.title('每个类别的MSE')
plt.xlabel('类别')
plt.ylabel('MSE')
plt.grid(axis='y', linestyle='--', linewidth=0.5)
plt.tight_layout()
plt.show()


# In[53]:


# Merge attachment 3 (wholesale prices) with attachment 1 (product and category info)
pricing_data = pd.merge(attachment_3, attachment_1, on="单品编码", how="left")

# Average wholesale price for each category in the past
avg_wholesale_price = pricing_data.groupby('分类名称')['批发价格(元/千克)'].mean()

# Merge the average wholesale price with the loss rate from attachment 4
cost_data = pd.merge(avg_wholesale_price, attachment_4, left_on='分类名称', right_on='小分类名称', how="left")

# Calculate the cost per kg considering the loss rate
cost_data['cost_per_kg'] = cost_data['批发价格(元/千克)'] * (1 + cost_data['平均损耗率(%)_小分类编码_不同值'] / 100)

# Extract the relevant columns
cost_data = cost_data[['小分类名称', 'cost_per_kg']]

cost_data


# In[54]:


# Define a function to set the markup rate based on forecasted sales volume
def determine_markup_rate(sales_forecast):
    if sales_forecast < 20:
        return 1.5  # 150% markup for low volume items
    elif sales_forecast < 50:
        return 1.3  # 130% markup for medium volume items
    else:
        return 1.2  # 120% markup for high volume items

# Determine the price strategy for the next week
price_strategy = {}
for category, forecast in forecasts.items():
    cost = cost_data[cost_data['小分类名称'] == category]['cost_per_kg'].values[0]
    markup_rates = [determine_markup_rate(f) for f in forecast]
    prices = [cost * rate for rate in markup_rates]
    price_strategy[category] = prices

price_strategy


# In[55]:


#问题三


# In[58]:


# Group by '单品名称' and get the average sales for the entire dataset
avg_sales_overall = merged_data.groupby('单品名称')['销量(千克)'].mean().reset_index()

# Use this as the forecasted sales for July 1st
forecasted_sales_july1_overall = avg_sales_overall.copy()
forecasted_sales_july1_overall.rename(columns={'销量(千克)': '预测销量_7月1日'}, inplace=True)

forecasted_sales_july1_overall.head()


# In[59]:


# Merge forecasted sales with pricing and loss rate data
product_cost_data = pd.merge(forecasted_sales_july1_overall, pricing_data, on="单品名称", how="left")
product_cost_data = pd.merge(product_cost_data, attachment_4, left_on='分类名称', right_on='小分类名称', how="left")

# Calculate the cost per kg considering the loss rate
product_cost_data['cost_per_kg'] = product_cost_data['批发价格(元/千克)'] * (1 + product_cost_data['平均损耗率(%)_小分类编码_不同值'] / 100)

product_cost_data[['单品名称', '预测销量_7月1日', '批发价格(元/千克)', '平均损耗率(%)_小分类编码_不同值', 'cost_per_kg']].head()


# In[60]:


# Define a function to set the markup rate based on forecasted sales volume for individual products
def determine_product_markup_rate(sales_forecast):
    if sales_forecast < 0.5:
        return 1.5  # 150% markup for low volume items
    elif sales_forecast < 1:
        return 1.3  # 130% markup for medium volume items
    else:
        return 1.2  # 120% markup for high volume items

# Calculate the expected profit for each product
product_cost_data['markup_rate'] = product_cost_data['预测销量_7月1日'].apply(determine_product_markup_rate)
product_cost_data['expected_price'] = product_cost_data['cost_per_kg'] * product_cost_data['markup_rate']
product_cost_data['expected_profit_per_kg'] = product_cost_data['expected_price'] - product_cost_data['cost_per_kg']
product_cost_data['total_expected_profit'] = product_cost_data['expected_profit_per_kg'] * product_cost_data['预测销量_7月1日']

# Sort the products based on the expected profit
sorted_products = product_cost_data.sort_values(by='total_expected_profit', ascending=False).drop_duplicates(subset='单品名称')

sorted_products[['单品名称', '预测销量_7月1日', 'cost_per_kg', 'expected_price', 'total_expected_profit']].head()


# In[64]:


# Select top 27-33 products based on the expected profit
selected_products = sorted_products.head(33)

# Calculate the total expected profit for these selected products
total_expected_profit = selected_products['total_expected_profit'].sum()

selected_products[['单品名称', '预测销量_7月1日', 'cost_per_kg', 'expected_price', 'total_expected_profit']], total_expected_profit


# In[65]:


# Set the replenishment volume for each selected product
min_display_volume = 2.5
selected_products['replenishment_volume'] = selected_products['预测销量_7月1日'].apply(lambda x: max(x, min_display_volume))

# Display the selected products, their expected price and replenishment volume
selected_products_final = selected_products[['单品名称', 'expected_price', 'replenishment_volume']]
selected_products_final
selected_products_final.to_excel(r"C:\Users\86136\Desktop\selected_products_final.xlsx", index=False)


# In[ ]:

  • 2
    点赞
  • 7
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

布凯彻-劳斯基

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值