Preface
💖💖 Author: 计算机程序员小杨
💙💙 About me: I work in the computer field and am comfortable with Java, WeChat Mini Programs, Python, Golang, Android, and several other IT directions. I take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I also know some techniques for reducing plagiarism-check similarity. I love technology, enjoy digging into new tools and frameworks, and like solving real problems with code. If you have questions about code or technology, feel free to ask me!
💛💛 A quick note: thank you all for your attention and support!
💕💕 To get the source code, contact 计算机程序员小杨 (details at the end of the article)
💜💜
Web application practical projects
Android / Mini Program practical projects
Big data practical projects
Deep learning practical projects
Graduation project topic selection
💜💜
I. Development Tools Overview
Big-data framework: Hadoop + Spark (Hive is not used in this build; customization is supported)
Development languages: Python + Java (both versions are available)
Backend frameworks: Django + Spring Boot (Spring + SpringMVC + MyBatis) (both versions are available)
Frontend: Vue + ElementUI + ECharts + HTML + CSS + JavaScript + jQuery
Key technologies: Hadoop, HDFS, Spark, Spark SQL, Pandas, NumPy
Database: MySQL
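To make the stack concrete, here is a minimal sketch (not taken from the project itself) of one way Spark could pull the MySQL order table in for distributed analysis over JDBC. The database host, schema name, account, password, and driver version below are all illustrative assumptions, and the MySQL JDBC driver must be available on the Spark classpath.

from pyspark.sql import SparkSession

# Illustrative sketch only: connection details below are assumptions.
spark = (SparkSession.builder
         .appName("BreakfastOrderAnalysis")
         .config("spark.jars.packages", "mysql:mysql-connector-java:8.0.33")
         .getOrCreate())

# Pull the MySQL order table into Spark for distributed analysis.
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:mysql://localhost:3306/breakfast_db")  # assumed host/db
          .option("dbtable", "breakfast_orders")
          .option("user", "analytics")       # assumed account
          .option("password", "change-me")   # assumed password
          .load())
orders.printSchema()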
II. System Overview
The Chinese Breakfast Shop Order Data Analysis and Visualization System (中式早餐店订单数据分析与可视化系统) is a big-data business-intelligence platform for the restaurant trade, built on Hadoop's distributed storage architecture and the Spark compute engine, with a Django backend and a Vue frontend stack. Order data is stored at scale on HDFS; Spark SQL handles querying and analysis; Pandas and NumPy carry out the statistical computations. The frontend uses Vue with the ElementUI component library, and ECharts renders the dynamic visualizations. Core modules include user and permission management, order-data ingestion and processing, multi-dimensional sales-performance analysis, product sales-trend mining, customer consumption-behavior profiling, store operating-efficiency evaluation, and a real-time monitoring dashboard, giving a Chinese breakfast shop comprehensive data support for digital operations and decision making.
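As a minimal illustration of the HDFS-to-Spark-SQL path described above, the sketch below (reusing the spark session from the previous sketch) reads order files from an assumed HDFS location and registers them as the breakfast_orders view that the source code in section V queries. The path and Parquet format are assumptions, not details from the project.

# Minimal sketch: the HDFS path and file format are assumed for illustration.
orders_hdfs = spark.read.parquet("hdfs://namenode:9000/warehouse/breakfast/orders/")
orders_hdfs.createOrReplaceTempView("breakfast_orders")

# Quick sanity check with Spark SQL before running the heavier analyses.
spark.sql("SELECT COUNT(*) AS order_total FROM breakfast_orders").show()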
III. Feature Demo
[Big Data] Chinese Breakfast Shop Order Data Analysis and Visualization System: a computer science graduation project (Hadoop + Spark environment setup, Data Science and Big Data Technology, with source code, documentation, and a video walkthrough)
IV. UI Screenshots
V. Source Code
import json

from datetime import datetime, timedelta

import pandas as pd
from django.http import JsonResponse
from pyspark.sql import SparkSession

# Shared SparkSession for all analysis views; adaptive query execution lets
# Spark tune shuffle partitions at runtime. getOrCreate() reuses any existing
# session, so this module is safe to import from several Django apps.
spark = (SparkSession.builder
         .appName("BreakfastOrderAnalysis")
         .config("spark.sql.adaptive.enabled", "true")
         .getOrCreate())
def sales_performance_analysis(request):
    """Daily sales trend, overall totals, peak day, and weekly comparison."""
    start_date = request.GET.get('start_date')
    end_date = request.GET.get('end_date')
    store_id = request.GET.get('store_id')
    # CAUTION: spark.sql() has no bind parameters, so these request values are
    # interpolated directly; validate/whitelist them upstream to prevent SQL injection.
    query = f"""
        SELECT DATE(order_time) AS order_date, SUM(total_amount) AS daily_sales,
               COUNT(order_id) AS order_count, AVG(total_amount) AS avg_order_value
        FROM breakfast_orders
        WHERE order_time BETWEEN '{start_date}' AND '{end_date}'
          AND store_id = {store_id} AND order_status = 'completed'
        GROUP BY DATE(order_time)
        ORDER BY order_date
    """
    sales_trend = spark.sql(query).toPandas()
    if sales_trend.empty:
        return JsonResponse({'error': 'no completed orders in the selected range'}, status=404)
    # DATE() comes back as Python date objects; convert so the .dt accessor works.
    sales_trend['order_date'] = pd.to_datetime(sales_trend['order_date'])
    total_sales = sales_trend['daily_sales'].sum()
    total_orders = sales_trend['order_count'].sum()
    avg_daily_sales = sales_trend['daily_sales'].mean()
    peak_sales_day = sales_trend.loc[sales_trend['daily_sales'].idxmax()]
    # Growth of the last day's sales over the first day's, guarding against a
    # zero-sales first day.
    first_day_sales = sales_trend['daily_sales'].iloc[0]
    growth_rate = ((sales_trend['daily_sales'].iloc[-1] - first_day_sales)
                   / first_day_sales * 100) if first_day_sales else 0.0
    # Roll daily figures up to ISO weeks for week-over-week comparison.
    weekly_comparison = sales_trend.groupby(
        sales_trend['order_date'].dt.isocalendar().week
    ).agg({'daily_sales': 'sum', 'order_count': 'sum'}).reset_index()
    performance_metrics = {
        'total_sales': float(total_sales),
        'total_orders': int(total_orders),
        'avg_daily_sales': float(avg_daily_sales),
        'peak_day': peak_sales_day['order_date'].strftime('%Y-%m-%d'),
        'peak_sales': float(peak_sales_day['daily_sales']),
        'growth_rate': float(growth_rate),
        # Round-trip through JSON so numpy scalars and timestamps become plain
        # Python types that JsonResponse can serialize.
        'weekly_data': json.loads(weekly_comparison.to_json(orient='records')),
        'trend_data': json.loads(sales_trend.to_json(orient='records', date_format='iso')),
    }
    return JsonResponse(performance_metrics)
def product_sales_analysis(request):
    """Product revenue ranking, category share, and slow-moving items."""
    time_period = request.GET.get('period', '30')
    store_id = request.GET.get('store_id')
    end_date = datetime.now()
    start_date = end_date - timedelta(days=int(time_period))
    product_query = f"""
        SELECT p.product_name, p.category, SUM(oi.quantity) AS total_quantity,
               SUM(oi.subtotal) AS total_revenue, AVG(oi.unit_price) AS avg_price,
               COUNT(DISTINCT oi.order_id) AS order_frequency
        FROM order_items oi
        JOIN products p ON oi.product_id = p.product_id
        JOIN breakfast_orders bo ON oi.order_id = bo.order_id
        WHERE bo.order_time BETWEEN '{start_date}' AND '{end_date}'
          AND bo.store_id = {store_id} AND bo.order_status = 'completed'
        GROUP BY p.product_id, p.product_name, p.category
        ORDER BY total_revenue DESC
    """
    product_data = spark.sql(product_query).toPandas()
    if product_data.empty:
        return JsonResponse({'error': 'no completed orders in the selected period'}, status=404)
    product_data['revenue_percentage'] = product_data['total_revenue'] / product_data['total_revenue'].sum() * 100
    # Rough estimate only: assumes a flat 30% margin across all products.
    product_data['profit_margin'] = product_data['total_revenue'] * 0.3
    category_analysis = product_data.groupby('category').agg({
        'total_quantity': 'sum',
        'total_revenue': 'sum',
        'order_frequency': 'sum',
    }).reset_index()
    category_analysis['category_share'] = category_analysis['total_revenue'] / category_analysis['total_revenue'].sum() * 100
    top_products = product_data.head(10)  # already sorted by revenue, descending
    # Slow movers: the bottom 20% of products by units sold over the period.
    slow_moving = product_data[product_data['total_quantity'] < product_data['total_quantity'].quantile(0.2)]
    bestseller_analysis = {
        'top_products': json.loads(top_products.to_json(orient='records')),
        'category_performance': json.loads(category_analysis.to_json(orient='records')),
        'slow_moving_items': json.loads(slow_moving.to_json(orient='records')),
        'total_products': len(product_data),
        'avg_revenue_per_product': float(product_data['total_revenue'].mean()),
    }
    return JsonResponse(bestseller_analysis)
def customer_behavior_analysis(request):
    """Customer-grain value metrics, segments, churn risk, and peak ordering hour."""
    analysis_period = request.GET.get('period', '90')
    store_id = request.GET.get('store_id')
    end_date = datetime.now()
    start_date = end_date - timedelta(days=int(analysis_period))
    # Customer-grain aggregation. The original query also selected
    # EXTRACT(HOUR FROM order_time), which is invalid alongside GROUP BY
    # customer_id; the peak hour is computed by a separate query below.
    customer_query = f"""
        SELECT customer_id, COUNT(order_id) AS order_frequency,
               SUM(total_amount) AS total_spent, AVG(total_amount) AS avg_order_value,
               MIN(order_time) AS first_order, MAX(order_time) AS last_order
        FROM breakfast_orders
        WHERE order_time BETWEEN '{start_date}' AND '{end_date}'
          AND store_id = {store_id} AND order_status = 'completed'
        GROUP BY customer_id
    """
    customer_data = spark.sql(customer_query).toPandas()
    if customer_data.empty:
        return JsonResponse({'error': 'no completed orders in the selected range'}, status=404)
    # Order volume by hour of day, at order grain rather than customer grain.
    hourly_query = f"""
        SELECT HOUR(order_time) AS order_hour, COUNT(order_id) AS orders
        FROM breakfast_orders
        WHERE order_time BETWEEN '{start_date}' AND '{end_date}'
          AND store_id = {store_id} AND order_status = 'completed'
        GROUP BY HOUR(order_time)
        ORDER BY orders DESC
    """
    peak_hours = spark.sql(hourly_query).toPandas()
    customer_data['first_order'] = pd.to_datetime(customer_data['first_order'])
    customer_data['last_order'] = pd.to_datetime(customer_data['last_order'])
    customer_data['days_since_first'] = (customer_data['last_order'] - customer_data['first_order']).dt.days
    # +1 keeps single-day customers from dividing by zero.
    customer_data['purchase_frequency'] = customer_data['order_frequency'] / (customer_data['days_since_first'] + 1)
    # Simple weighted value score: 40% monetary, 60% frequency.
    customer_data['customer_value_score'] = customer_data['total_spent'] * 0.4 + customer_data['order_frequency'] * 0.6
    high_value_customers = customer_data[customer_data['customer_value_score'] > customer_data['customer_value_score'].quantile(0.8)]
    # Churn risk: no order within the last 14 days.
    churn_risk_customers = customer_data[(pd.Timestamp.now() - customer_data['last_order']).dt.days > 14]
    customer_segments = pd.cut(customer_data['total_spent'], bins=3, labels=['Low', 'Medium', 'High'])
    segment_analysis = customer_segments.value_counts()
    repeat_customers = customer_data[customer_data['order_frequency'] > 1]
    repeat_rate = len(repeat_customers) / len(customer_data) * 100
    behavior_insights = {
        'total_customers': len(customer_data),
        'repeat_customer_rate': float(repeat_rate),
        'avg_order_frequency': float(customer_data['order_frequency'].mean()),
        'avg_customer_value': float(customer_data['total_spent'].mean()),
        'high_value_customers': len(high_value_customers),
        'churn_risk_count': len(churn_risk_customers),
        'peak_hour': int(peak_hours['order_hour'].iloc[0]),
        'customer_segments': {str(k): int(v) for k, v in segment_analysis.items()},
        'high_value_list': json.loads(high_value_customers.head(20).to_json(orient='records', date_format='iso')),
    }
    return JsonResponse(behavior_insights)
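The excerpt above does not show how these endpoints are exposed, so here is a hedged sketch of the Django URL wiring, assuming the three views live in an app's views module; the module and route names are assumptions, not taken from the project.

# urls.py -- illustrative routing for the three analysis endpoints above.
from django.urls import path
from . import views  # assumed module name; adjust to the actual app layout

urlpatterns = [
    path('api/analysis/sales/', views.sales_performance_analysis),
    path('api/analysis/products/', views.product_sales_analysis),
    path('api/analysis/customers/', views.customer_behavior_analysis),
]

The Vue frontend could then request, for example, /api/analysis/sales/?start_date=2024-01-01&end_date=2024-01-31&store_id=1 and bind the returned trend_data to an ECharts line chart.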
VI. Documentation
Closing
💕💕 To get the source code, contact 计算机程序员小杨