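Third-Year Course Project: A Classification, Clustering, and Prediction System

This post describes a third-year machine learning course project. The assignment asks students to work through the complete pipeline: data preparation, feature engineering, model selection (KNN, Bayes, decision trees, and so on), training, and tuning, applied to classification, prediction, and clustering tasks. The examples show how to call Python libraries as well as custom algorithms, and how to compare results across datasets and metrics.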

Third-year machine learning course project

Here is an introduction to our course project system.

First, the project requirements:

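1. Become familiar with the complete machine learning workflow: problem modeling, data acquisition, feature engineering, model training, model tuning, and online deployment. Equivalently, three stages: data preparation and preprocessing, model selection and training, and model validation and hyperparameter tuning.
2. Draw a mind map that classifies machine learning algorithms into supervised, unsupervised, semi-supervised, and reinforcement learning, summarizing the algorithms covered in class.
3. Choose your own learning tasks and, following the machine learning workflow, design a classification system, a prediction system, and a clustering system. Each system must train several different algorithms, use more than one method for model validation and hyperparameter tuning, evaluate the models with several suitable metrics, and analyze the results with visualizations.
(1) Classification:
k-nearest neighbors, Bayes classifier, decision tree classification, BP neural network, AdaBoost, GBDT, random forest, logistic regression, etc.
(2) Prediction: Bayesian network, Markov model, linear regression, XGBoost, ridge regression, polynomial regression, decision tree regression, deep neural network prediction
(3) Clustering: K-means, hierarchical clustering (BIRCH), density-based clustering (DBSCAN), Gaussian mixture model (GMM) clustering, density-based clustering (OPTICS), grid-based clustering (STING, CLIQUE), Mean Shift
Of these, algorithms marked in blue must be used, those marked in red are electives (pick at least one; picking more earns extra credit), and those in black are optional and earn extra credit if used.
4. Requirements
(1) The chosen algorithms may be implemented by calling the relevant Python library functions directly, but you must analyze the library source code and work out the structure of the algorithm and what each part does. You may also write the algorithms yourself and run comparison experiments against the library versions.
(2) Datasets must include a small, a medium, and a large dataset, and should cover structured, semi-structured, and unstructured data.
(3) Within each family of algorithms, compare the algorithms across different datasets and different metrics.
(4) The code must be commented in reasonable detail, with an explanation and functional notes for each module.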
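
Requirement 3 asks for model validation and hyperparameter tuning, and requirement 4(1) allows hand-written algorithms. As a minimal sketch of both ideas, the plain-Python example below wraps a tiny k-nearest-neighbors classifier in k-fold cross-validation and uses it to compare several values of k. The function names (`k_fold_indices`, `knn_predict`, `cross_val_accuracy`) and the toy data are made up for illustration; they are not part of the project code.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = [y for _, y in dists[:k]]
    return max(set(votes), key=votes.count)


def cross_val_accuracy(xs, ys, n_neighbors, folds=5):
    """Mean accuracy of the k-NN classifier over all folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), folds):
        tx = [xs[i] for i in train_idx]
        ty = [ys[i] for i in train_idx]
        hits = sum(knn_predict(tx, ty, xs[i], n_neighbors) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Toy data: two well-separated 2-D groups (invented for this example).
xs = [(0.1 * i, 0.0) for i in range(10)] + [(5.0 + 0.1 * i, 5.0) for i in range(10)]
ys = [0] * 10 + [1] * 10

# Tuning: evaluate several candidate values of k and keep the best one.
results = {k: cross_val_accuracy(xs, ys, k) for k in (1, 3, 5)}
best_k = max(results, key=results.get)
print(results, best_k)
```

With library code the same comparison would normally go through scikit-learn's cross-validation utilities; the hand-written version is only meant to make the mechanics visible.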

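Next, the interface of our "RGB system" (the UI is ugly and built from solid-color panels, hence the name).

  1. Main window
    • The main window has four buttons: the first three each open a subsystem, and the last one exits.
  2. Click the classification button to open the classification subsystem. We implemented 7 algorithms here and provide small, medium, and large built-in datasets; you can also enter a file path to import your own data.
  3. Select the algorithms and the dataset you want and click "run it". The evaluation metrics and their values for each selected algorithm appear in the upper right.
  4. Click "next pic" to show a visualization of the corresponding algorithm's classification results on the test data.
  5. Only one image can be shown at a time, so clicking "next pic" repeatedly cycles through the visualizations of the selected algorithms.
  6. When the custom ("diy") data option is selected, an input box appears; enter the path of the file you want to import.
  7. The subsystems are largely copy-pasted from one another, so the clustering and prediction systems are not described separately. On to the code.
Code: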

main.py

import subprocess
import sys
import tkinter as tk


def run_script(script):
    # Launch a subsystem with the same interpreter that runs this file.
    # run() blocks until the child window is closed, as os.system() did.
    subprocess.run([sys.executable, script])


window = tk.Tk()
window.title("machine learning")
window.geometry("300x400")  # window size
tk.Label(window, text="Please choose an operation", font=("微软雅黑", 12)).pack()
tk.Button(window, text="Classification", font=("微软雅黑", 12), width=15, height=2,
          command=lambda: run_script("UI_classfiy.py")).pack()
tk.Button(window, text="Clustering", font=("微软雅黑", 12), width=15, height=2,
          command=lambda: run_script("UI_Cluster.py")).pack()
tk.Button(window, text="Prediction", font=("微软雅黑", 12), width=15, height=2,
          command=lambda: run_script("UI_forecast.py")).pack()
# destroy() closes the window and ends mainloop(); quit() is meant for the interactive shell
tk.Button(window, text="Exit", font=("微软雅黑", 12), width=15, height=2, command=window.destroy).pack()
window.mainloop()  # enter the Tk event loop

classfiy.py

# Classifiers wrapped by this module:
# k-nearest neighbors, naive Bayes, decision tree, AdaBoost, GBDT,
# random forest, logistic regression
import matplotlib.pyplot as plt
from sklearn import tree
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


class Classfiy(object):
    def __init__(self, x_train, y_train, x_test, y_test):
        self.x_train = x_train
        self.y_train = y_train
        self.x_test = x_test
        self.y_test = y_test
        # Predictions of each algorithm on the test set; 0 means "not run yet"
        self.KNN_pred, self.beyes_pred, self.DT_pred, self.AdaBoost_pred, self.RF_pred, self.LR_pred, self.GBDT_pred \
            = 0, 0, 0, 0, 0, 0, 0

    def KNN(self, k=5, p=2):
        # With the Minkowski metric, p=2 is the Euclidean distance
        knn = KNeighborsClassifier(n_neighbors=k, p=p, metric='minkowski')
        knn.fit(self.x_train, self.y_train)
        self.KNN_pred = knn.predict(self.x_test)
        self.save_pic("KNN", self.KNN_pred)

    def beyes(self):
        # Gaussian naive Bayes
        beyes = GaussianNB()
        beyes.fit(self.x_train, self.y_train)
        self.beyes_pred = beyes.predict(self.x_test)
        self.save_pic("beyes", self.beyes_pred)

    def DT(self):
        # Decision tree split on information gain
        dt = tree.DecisionTreeClassifier(criterion="entropy")
        dt.fit(self.x_train, self.y_train)
        self.DT_pred = dt.predict(self.x_test)
        self.save_pic("DT", self.DT_pred)

    def AdaBoost(self, n_estimators=100):
        AB = AdaBoostClassifier(n_estimators=n_estimators)
        AB.fit(self.x_train, self.y_train)
        self.AdaBoost_pred = AB.predict(self.x_test)
        self.save_pic("AdaBoost", self.AdaBoost_pred)

    def RF(self):
        RF = RandomForestClassifier(criterion='entropy', n_estimators=10, random_state=1, n_jobs=2)
        RF.fit(self.x_train, self.y_train)
        self.RF_pred = RF.predict(self.x_test)
        self.save_pic("RF", self.RF_pred)

    def LR(self):
        LR = LogisticRegression(solver='liblinear')
        LR.fit(self.x_train, self.y_train)
        self.LR_pred = LR.predict(self.x_test)
        self.save_pic("LR", self.LR_pred)

    def GBDT(self):
        # This is a classification task, so use the classifier. The regressor used
        # originally returns real values, which were then truncated to int labels.
        GBDT = GradientBoostingClassifier()
        GBDT.fit(self.x_train, self.y_train)
        self.GBDT_pred = GBDT.predict(self.x_test)
        self.save_pic("GBDT", self.GBDT_pred)

    def Evaluation_indicators(self, stri, y_pred):
        # Returns [name, accuracy, precision, recall, F1], rounded to 3 decimals.
        # Note that the three averaged metrics use different averaging modes
        # (macro, micro, weighted), so they are not directly comparable.
        return [stri,
                round(accuracy_score(self.y_test, y_pred), 3),
                round(precision_score(self.y_test, y_pred, average="macro"), 3),
                round(recall_score(self.y_test, y_pred, average="micro"), 3),
                round(f1_score(self.y_test, y_pred, average="weighted"), 3)]

    def save_pic(self, stri, y_pred):
        # Scatter the first two features of the test set, colored by predicted class
        plt.clf()  # start from an empty figure so earlier plots do not pile up
        plt.title(stri)
        plt.scatter(self.x_test[:, 0], self.x_test[:, 1], c=y_pred)
        plt.savefig("image/"+stri+".png", dpi=55)


Clusterer.py

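# Clustering algorithms wrapped by this module:
# K-means, hierarchical clustering (BIRCH), density-based clustering (DBSCAN),
# Gaussian mixture model (GMM), density-based clustering (OPTICS), Mean Shift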
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.cluster import KMeans, Birch, DBSCAN, OPTICS, MeanShift
from sklearn.mixture import GaussianMixture
import sklearn
from sklearn import metrics
from sklearn.metrics import accuracy_score
from sklearn.metrics import homogeneity_completeness_v_measure
import numpy as np


# Purity: map every cluster to its majority true class, then measure accuracy
def purity_score(y_true, y_pred):
    # Re-encode the true labels as 0..n-1. return_inverse builds a new array,
    # so the caller's y_true is no longer modified in place.
    labels, y_true = np.unique(y_true, return_inverse=True)
    y_voted_labels = np.zeros(y_true.shape, dtype=int)
    bins = np.arange(labels.shape[0] + 1)  # one histogram bin per class
    for cluster in np.unique(y_pred):
        hist, _ = np.histogram(y_true[y_pred == cluster], bins=bins)
        winner = np.argmax(hist)  # majority class inside this cluster
        y_voted_labels[y_pred == cluster] = winner
    return accuracy_score(y_true, y_voted_labels)


class Cluser:
    def __init__(self, k, data, y_true):
        self.K = k
        self.data = data
        self.y_true = y_true

        self.kmeams_pred, self.birch_pred, self.dbscan_pred, self.gmm_pred, self.optics_pred, self.MS_pred = \
            0, 0, 0, 0, 0, 0

    def K_means(self):
        kmeans = KMeans(n_clusters=self.K)
        self.kmeams_pred = kmeans.fit_predict(self.data)
        self.save_pic("K_means", self.kmeams_pred)

    def BIRCH(self):
        model = Birch(n_clusters=self.K)
        self.birch_pred = model.fit_predict(self.data)
        self.save_pic("BIRCH", self.birch_pred)

    def DBSCAN(self):
        model = DBSCAN()
        self.dbscan_pred = model.fit_predict(self.data)
        self.save_pic("DBSCAN", self.dbscan_pred)

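    def GMM(self):
        # n_components defaults to 1, which puts every sample in one cluster,
        # so pass the requested number of clusters explicitly
        model = GaussianMixture(n_components=self.K, n_init=3)
        self.gmm_pred = model.fit_predict(self.data)
        self.save_pic("GMM", self.gmm_pred)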

    def OPTICS(self):
        model = OPTICS()
        self.optics_pred = model.fit_predict(self.data)
        self.save_pic("OPTICS", self.optics_pred)

    def Mean_Shift(self):
        model = MeanShift()
        self.MS_pred = model.fit_predict(self.data)
        self.save_pic("Mean_Shift", self.MS_pred)

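    def Evaluation_indicators(self, stri, y_pred):
        # Returns [name, purity, adjusted Rand index, micro F1, mutual information,
        # homogeneity, completeness, V-measure], rounded to 3 decimals
        h_c_v = homogeneity_completeness_v_measure(self.y_true, y_pred)
        return [stri,
                round(purity_score(self.y_true, y_pred), 3),
                round(metrics.adjusted_rand_score(self.y_true, y_pred), 3),
                # Caution: cluster ids are arbitrary, so F1 against the raw ids is only
                # meaningful when they happen to line up with the class labels
                round(sklearn.metrics.f1_score(self.y_true, y_pred, average='micro'), 3),
                round(metrics.mutual_info_score(self.y_true, y_pred), 3),
                round(h_c_v[0], 3),
                round(h_c_v[1], 3),
                round(h_c_v[2], 3)]

    def save_pic(self, stri, y_pred):
        # Scatter the first two features, colored by cluster id
        plt.clf()  # start from an empty figure so earlier plots do not pile up
        plt.title(stri)
        plt.scatter(self.data[:, 0], self.data[:, 1], c=y_pred)
        plt.savefig("image/"+stri+".png", dpi=55)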
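
The idea behind `purity_score` can be stated without NumPy: for every cluster, count how many members belong to its most common true class, add those counts up, and divide by the number of samples. The following is a minimal plain-Python sketch of that definition; the function name `purity` and the sample labels are made up for illustration.

```python
from collections import Counter, defaultdict


def purity(y_true, y_pred):
    """Fraction of samples that belong to the majority true class of their cluster."""
    members = defaultdict(list)
    for true_label, cluster_id in zip(y_true, y_pred):
        members[cluster_id].append(true_label)
    # For each cluster, take the size of its largest true-class group
    majority_total = sum(Counter(labels).most_common(1)[0][1]
                         for labels in members.values())
    return majority_total / len(y_true)


# Invented example: cluster 0 holds {a, a}, cluster 1 holds {a, b, b, c}.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = [0, 0, 1, 1, 1, 1]
print(purity(y_true, y_pred))
```

Purity never decreases when clusters are split further (one cluster per sample gives a purity of 1), which is one reason to report it alongside the adjusted Rand index and V-measure rather than alone.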


forecast.py

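# Regressors wrapped by this module:
# Bayesian ridge regression (filling the "Bayesian network" slot of the assignment),
# Markov model, linear regression, XGBoost, ridge regression,
# polynomial regression, decision tree regression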

import numpy as np
import xgboost
from hmmlearn.hmm import GaussianHMM
from matplotlib import pyplot as plt
from sklearn import linear_model, metrics
import sklearn.pipeline as pl
import sklearn.preprocessing as sp
import sklearn.linear_model as lm
from sklearn.linear_model import LinearRegression, BayesianRidge
from sklearn.tree import DecisionTreeRegressor


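# Percentage-error metrics
def mape(y_true, y_pred):
    # Mean absolute percentage error. Samples whose true value is 0 are skipped,
    # because the ratio is undefined there (the iris targets contain zeros).
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    mask = y_true != 0
    return np.mean(np.abs((y_pred[mask] - y_true[mask]) / y_true[mask])) * 100


def smape(y_true, y_pred):
    # Symmetric MAPE. A term with y_true == y_pred == 0 is a perfect prediction
    # and is counted as 0 instead of evaluating 0/0.
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    diff = np.abs(y_pred - y_true)
    denom = np.abs(y_pred) + np.abs(y_true)
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom != 0)
    return 2.0 * np.mean(ratio) * 100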


class Forecast(object):
    def __init__(self, x_train, y_train, x_test, y_test):
        self.x_train = x_train
        self.y_train = y_train
        self.x_test = x_test
        self.y_test = y_test

        self.xgb_pred, self.LR_pred, self.DT_pred, self.polynomial_pred, self.RidgeCv_pred, self.byes_pred, \
        self.markov_pred = 0, 0, 0, 0, 0, 0, 0

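    # XGBoost
    def XGBoost(self):
        # This is a regression task, so use XGBRegressor rather than XGBClassifier
        bst = xgboost.XGBRegressor()
        bst.fit(self.x_train, self.y_train)
        self.xgb_pred = bst.predict(self.x_test)
        self.save_pic("XGBoost", self.xgb_pred)

    # Linear regression
    def LR(self):
        model = LinearRegression()
        model.fit(self.x_train, self.y_train)
        self.LR_pred = model.predict(self.x_test)
        self.save_pic("LR", self.LR_pred)

    # Decision tree regression
    def DT(self):
        model = DecisionTreeRegressor(max_depth=5)
        model.fit(self.x_train, self.y_train)
        self.DT_pred = model.predict(self.x_test)
        self.save_pic("DT", self.DT_pred)

    # Polynomial regression
    def polynomial(self, degree=2):
        # The original hard-coded degree 10. The number of polynomial terms grows
        # combinatorially with the feature count (4 features at degree 10 give
        # 1001 terms; 13 features give over a million), so a small default is used.
        model = pl.make_pipeline(
            sp.PolynomialFeatures(degree),  # polynomial feature expansion
            lm.LinearRegression())  # linear regressor on the expanded features
        model.fit(self.x_train, self.y_train)
        self.polynomial_pred = model.predict(self.x_test)
        self.save_pic("polynomial", self.polynomial_pred)

    # Ridge regression with built-in cross-validation over alpha
    def RidgeCv(self):
        model = linear_model.RidgeCV()
        model.fit(self.x_train, self.y_train)
        self.RidgeCv_pred = model.predict(self.x_test)
        self.save_pic("RidgeCv", self.RidgeCv_pred)

    # Bayesian ridge regression (stands in for the "Bayesian network" requirement)
    def byes(self):
        # Default configuration. BayesianRidge is a linear regression model with
        # Bayesian priors on the weights; it is neither a Bayesian network nor naive Bayes.
        mnb = BayesianRidge()
        mnb.fit(self.x_train, self.y_train)
        self.byes_pred = mnb.predict(self.x_test)
        self.save_pic("byes", self.byes_pred)

    # Markov model
    def Markov(self):
        # GaussianHMM is unsupervised: it is fitted on the features only, and
        # predict() returns the most likely hidden state (0, 1 or 2) for each test
        # row, not an estimate of y. Read the error metrics for it with that in mind.
        model = GaussianHMM(n_components=3, covariance_type='diag', n_iter=1000)
        model.fit(self.x_train)
        self.markov_pred = model.predict(self.x_test)
        self.save_pic("Markov", self.markov_pred)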

    def Evaluation_indicators(self, stri, y_pred):
        return [stri,
                round(metrics.mean_squared_error(self.y_test, y_pred), 3),
                round(np.sqrt(metrics.mean_squared_error(self.y_test, y_pred)), 3),
                round(metrics.mean_absolute_error(self.y_test, y_pred), 3),
                round(mape(self.y_test, y_pred), 3),
                round(smape(self.y_test, y_pred), 3)]

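    def save_pic(self, stri, y_pred):
        # Plot true and predicted values against the sample index
        plt.clf()  # clear the shared figure so curves from earlier algorithms do not accumulate
        plt.title(stri)
        plt.plot(np.arange(len(y_pred)), self.y_test, 'go-', label='test value')
        plt.plot(np.arange(len(y_pred)), y_pred, 'ro-', label='predict value')
        plt.legend()  # the labels above only appear once a legend is drawn
        plt.savefig("image/" + stri + ".png", dpi=55)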
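
To make the difference between the two percentage metrics concrete, here is a small worked example in plain Python. It is a sketch of the textbook definitions, with hypothetical helper names `mape_simple` and `smape_simple` and invented numbers. It assumes no true value is zero, so no zero handling is shown.

```python
def mape_simple(y_true, y_pred):
    """Mean absolute percentage error in percent (assumes no true value is 0)."""
    terms = [abs((p - t) / t) for t, p in zip(y_true, y_pred)]
    return 100 * sum(terms) / len(terms)


def smape_simple(y_true, y_pred):
    """Symmetric MAPE in percent: the error is scaled by |prediction| + |truth|."""
    terms = [abs(p - t) / (abs(p) + abs(t)) for t, p in zip(y_true, y_pred)]
    return 100 * 2 * sum(terms) / len(terms)


# The same absolute error of 50, once as an over-forecast and once as an under-forecast.
over = ([100.0], [150.0])   # truth 100, prediction 150
under = ([150.0], [100.0])  # truth 150, prediction 100

print("MAPE :", mape_simple(*over), mape_simple(*under))
print("sMAPE:", smape_simple(*over), smape_simple(*under))
```

MAPE divides by the true value only, so the over-forecast is penalized more heavily than the under-forecast of the same size. sMAPE gives both cases the same score, which is the reason it is reported next to MAPE.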


Button_classfiy.py

from classfiy import *
import pandas as pd
from sklearn.datasets import *
from sklearn.model_selection import train_test_split


class ButtonClassfiy(object):
    def __init__(self, ifSelect, dataOption, e_value):
        self.ifSelect = ifSelect
        self.dataOption = dataOption
        self.e_value = e_value
        self.data = 0
        self.target = 0
        self.x_train = 0
        self.y_train = 0
        self.x_test = 0
        self.y_test = 0
        self.result = []

    def get_data(self):
        # Built-in datasets, keyed by the value of the selected radio button
        loaders = {"iris": load_iris, "wine_data": load_wine, "breast_cancer": load_breast_cancer}
        if self.dataOption in loaders:
            self.data, self.target = loaders[self.dataOption](return_X_y=True)
        else:
            # Custom CSV file. The original never set the target in this branch.
            # Assumption made here: the last column is the label and all other
            # columns are numeric features.
            frame = pd.read_csv(self.e_value)
            self.data = frame.iloc[:, :-1].to_numpy()
            self.target = frame.iloc[:, -1].to_numpy()

        # Hold out 30% of the samples for testing
        self.x_train, self.x_test, self.y_train, self.y_test = train_test_split(
            self.data, self.target, test_size=0.30, random_state=42)

    def run(self):
        clf = Classfiy(self.x_train, self.y_train, self.x_test, self.y_test)
        # (checkbox index, display name, training method, attribute holding its predictions)
        steps = [
            (0, "KNN", clf.KNN, "KNN_pred"),
            (1, "Bayes classifier", clf.beyes, "beyes_pred"),
            (2, "Decision tree", clf.DT, "DT_pred"),
            (3, "AdaBoost", clf.AdaBoost, "AdaBoost_pred"),
            (4, "GBDT", clf.GBDT, "GBDT_pred"),
            (5, "Random forest", clf.RF, "RF_pred"),
            (6, "Logistic regression", clf.LR, "LR_pred"),
        ]
        for index, name, train, pred_attr in steps:
            if self.ifSelect[index]:
                train()  # fits the model and stores its predictions on clf
                self.result.append(
                    clf.Evaluation_indicators("{:<6}".format(name), getattr(clf, pred_attr)))

        return self.result
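
The custom-data branch of `get_data` reads a CSV file, but the post does not say which column holds the label. One possible convention, assumed here purely for illustration, is that the file has a header row and that the last column is the label. The sketch below applies that convention using only the standard library; the function name `split_features_labels` and the sample rows are hypothetical.

```python
import csv
import io


def split_features_labels(csv_text, has_header=True):
    """Split CSV text into numeric feature rows and a list of labels.

    Assumes the last column is the label and every other column is numeric.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]  # drop blank lines
    if has_header:
        rows = rows[1:]
    features = [[float(value) for value in row[:-1]] for row in rows]
    labels = [row[-1] for row in rows]
    return features, labels


# Two invented rows in the shape of the iris data.
sample = (
    "sepal_length,sepal_width,species\n"
    "5.1,3.5,setosa\n"
    "6.2,2.9,versicolor\n"
)
X, y = split_features_labels(sample)
print(X)
print(y)
```

Whatever convention is chosen, stating it next to the input box would save users from guessing the expected file layout.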

Button_cluster.py

from Clusterer import *
from sklearn.datasets import *
from sklearn.model_selection import train_test_split


class ButtonCluster(object):
    def __init__(self, ifSelect, dataOption, e_value):
        self.ifSelect = ifSelect
        self.dataOption = dataOption
        self.e_value = e_value
        self.data = 0
        self.target = 0
        self.x_train = 0
        self.y_train = 0
        self.x_test = 0
        self.y_test = 0
        self.result = []

    def get_data(self):
        # Built-in datasets, keyed by the value of the selected radio button
        loaders = {"iris": load_iris, "wine_data": load_wine, "breast_cancer": load_breast_cancer}
        if self.dataOption in loaders:
            self.data, self.target = loaders[self.dataOption](return_X_y=True)
        else:
            # Custom CSV file. The original never set the target in this branch.
            # Assumption made here: the last column holds the reference labels used
            # for evaluation and all other columns are numeric features.
            frame = pd.read_csv(self.e_value)
            self.data = frame.iloc[:, :-1].to_numpy()
            self.target = frame.iloc[:, -1].to_numpy()

        self.x_train, self.x_test, self.y_train, self.y_test = train_test_split(
            self.data, self.target, test_size=0.30, random_state=42)

    def run(self):
        # K is fixed at 3 (iris and wine have 3 classes; breast_cancer has 2).
        # Clustering runs on the training split; its labels are used only for evaluation.
        clf = Cluser(3, self.x_train, self.y_train)
        # (checkbox index, display name, clustering method, attribute holding its labels)
        steps = [
            (0, "K-means", clf.K_means, "kmeams_pred"),
            (1, "BIRCH", clf.BIRCH, "birch_pred"),
            (2, "DBSCAN", clf.DBSCAN, "dbscan_pred"),
            (3, "GMM", clf.GMM, "gmm_pred"),
            (4, "OPTICS", clf.OPTICS, "optics_pred"),
            (5, "Mean_Shift", clf.Mean_Shift, "MS_pred"),
        ]
        for index, name, fit, pred_attr in steps:
            if self.ifSelect[index]:
                fit()  # clusters the data and stores the labels on clf
                self.result.append(
                    clf.Evaluation_indicators("{:<6}".format(name), getattr(clf, pred_attr)))
        return self.result

Button_forecast.py

from forecast import *
import pandas as pd  # needed by the custom CSV branch; forecast.py does not import it
from sklearn.datasets import load_diabetes, load_iris
from sklearn.model_selection import train_test_split


class ButtonForecast(object):
    def __init__(self, ifSelect, dataOption, e_value):
        self.ifSelect = ifSelect
        self.dataOption = dataOption
        self.e_value = e_value
        self.data = 0
        self.target = 0
        self.x_train = 0
        self.y_train = 0
        self.x_test = 0
        self.y_test = 0
        self.result = []

    def get_data(self):
        if self.dataOption == "iris":
            self.data, self.target = load_iris(return_X_y=True)
        elif self.dataOption in ("wine_data", "breast_cancer"):
            # The original called load_boston() for both options, but that loader
            # was removed in scikit-learn 1.2. load_diabetes() is substituted here
            # as a built-in regression dataset; this is a stand-in, not the
            # dataset used in the original project.
            self.data, self.target = load_diabetes(return_X_y=True)
        else:
            # Custom CSV file. Assumption made here: the last column is the
            # regression target and all other columns are numeric features.
            frame = pd.read_csv(self.e_value)
            self.data = frame.iloc[:, :-1].to_numpy()
            self.target = frame.iloc[:, -1].to_numpy()

        self.x_train, self.x_test, self.y_train, self.y_test = train_test_split(
            self.data, self.target, test_size=0.30, random_state=42)

    # Checkbox order: Bayesian network, Markov model, linear regression, XGBoost,
    # ridge regression, polynomial regression, decision tree regression
    def run(self):
        clf = Forecast(self.x_train, self.y_train, self.x_test, self.y_test)
        # (checkbox index, display name, training method, attribute holding its predictions).
        # Reading the stored attribute also fixes the linear-regression entry, which
        # originally passed the return value of clf.LR() (None) to the metrics.
        steps = [
            (0, "Bayesian network", clf.byes, "byes_pred"),
            (1, "Markov model", clf.Markov, "markov_pred"),
            (2, "Linear regression", clf.LR, "LR_pred"),
            (3, "XGBoost", clf.XGBoost, "xgb_pred"),
            (4, "Ridge regression", clf.RidgeCv, "RidgeCv_pred"),
            (5, "Polynomial regression", clf.polynomial, "polynomial_pred"),
            (6, "Decision tree regression", clf.DT, "DT_pred"),
        ]
        for index, name, train, pred_attr in steps:
            if self.ifSelect[index]:
                train()  # fits the model and stores its predictions on clf
                self.result.append(
                    clf.Evaluation_indicators("{:<6}".format(name), getattr(clf, pred_attr)))
        return self.result


UI_classfiy.py

import tkinter as tk
from tkinter import *
from Button_classfiy import ButtonClassfiy
import tkinter.messagebox

window = tk.Tk()
window.title("machine learning")
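window.geometry("800x800+50+50")  # window size and position
data_option = StringVar()
data_option.set("iris")
ifSelect = {}
diy_label = Label()

# Window layout
# Note: place() returns None, so `select` and `data` are None here. Widgets that
# take them as parent therefore attach to the root window, and every coordinate
# below is relative to the root. The two frames only act as colored backgrounds.
select = tk.Frame(window, height=450, width=250, bg='green').place(x=10, y=50)
data = tk.Frame(window, height=250, width=250, bg='red').place(x=10, y=530)
text = Text(window, width=62, height=10)
text.place(x=320, y=50)
# Captions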
ChooseLabel = tk.Label(window, text="Please select a classification algorithm", font=('微软雅黑', 12)).place(x=10, y=10)
resultLabel = tk.Label(window, text="Results of the selected algorithm classification",
                       font=('微软雅黑', 12)).place(x=330, y=10)
e = tk.Entry(data)

canvas = tk.Canvas(window, height=260, width=440)
image_flie = tk.PhotoImage(file="image/1.png")
image = canvas.create_image(0, 0, anchor="nw", image=image_flie)
canvas.place(x=320, y=230)
pic_names = ["1.png"]
pic_name = pic_names[0]
pic_index = 0


# Show the current image, then advance to the next one (wrapping around)
def swicth_pic(pic_name1):
    global image, image_flie, pic_index, pic_names, pic_name
    image_flie = tk.PhotoImage(file='image/'+pic_name1)
    image = canvas.create_image(0, 0, anchor="nw", image=image_flie)
    pic_index += 1
    if pic_index >= len(pic_names):
        pic_index = 0
    pic_name = pic_names[pic_index]


Button(window, text="next pic", font=('微软雅黑', 8), command=lambda: swicth_pic(pic_name), bg='gray').place(x=650, y=200)


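# Custom data file path
def diy_data():
    global diy_label
    # Remove any label left from an earlier click, then show a label and an
    # entry box where the user types the path of the data file
    diy_label.place_forget()
    diy_label = Label(data, text="Data file path:", font=('微软雅黑', 10), bg="red")
    diy_label.place(x=10, y=700)
    e.place(x=110, y=703)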


def delete_diy():
    global diy_label
    diy_label.place_forget()
    e.place_forget()


def reflush():
    global ifSelect, data_option
    # Checkboxes for choosing the algorithms
    algorithm = {0: 'k-nearest neighbors', 1: 'Bayes classifier', 2: 'Decision tree', 3: 'AdaBoost', 4: 'GBDT',
                 5: 'Random forest', 6: 'Logistic regression'}
    # One BooleanVar per checkbox records whether it is ticked

    for i in range(len(algorithm)):
        ifSelect[i] = BooleanVar()
        Checkbutton(select, text=algorithm[i], font=('微软雅黑', 12), variable=ifSelect[i], bg='green') \
            .place(x=30, y=80 + i * 55, anchor="nw")

    # Data section
    # Radio buttons for the built-in small, medium, and large datasets, plus custom data

    tk.Radiobutton(window, text="Small dataset", variable=data_option, value="iris", bg='red', command=delete_diy) \
        .place(x=30, y=550)
    tk.Radiobutton(window, text="Medium dataset", variable=data_option, value="wine_data", bg='red', command=delete_diy) \
        .place(x=30, y=580)
    tk.Radiobutton(window, text="Large dataset", variable=data_option, value="breast_cancer", bg="red", command=delete_diy) \
        .place(x=30, y=610)
    tk.Radiobutton(window, text="Custom data", variable=data_option, value="diy", command=diy_data, bg='red') \
        .place(x=30, y=640)


reflush()


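# At this point all the inputs we need are available.
# The function below is bound to the run button.


def btn_f():
    global pic_names, pic_name, pic_index
    # Read the checkbox states into a plain list and leave the BooleanVar objects
    # in ifSelect untouched, so the handler can safely run again
    selected = [ifSelect[i].get() for i in range(7)]

    if not any(selected):
        tkinter.messagebox.showerror("Error", "No algorithm selected")
        return  # stay in the window instead of quitting the program

    e_value = e.get()
    all_pics = ["KNN.png", "beyes.png", "DT.png", "AdaBoost.png", "GBDT.png", "RF.png", "LR.png"]
    pic_names = [name for name, chosen in zip(all_pics, selected) if chosen]
    pic_name = pic_names[0]
    pic_index = 0  # restart the image cycle at the first selected algorithm
    # Train and evaluate the selected algorithms
    btnf = ButtonClassfiy(selected, data_option.get(), e_value)
    btnf.get_data()
    result = btnf.run()
    # Column order matches Classfiy.Evaluation_indicators (accuracy comes first)
    show_ = "Algorithm    Accuracy  Precision  Recall  F1-score\n"
    for i in result:
        show_ = show_ + str(i) + '\n'
    text.delete('1.0', 'end')
    text.insert(INSERT, show_)
    reflush()  # reset the checkboxes, as the original did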


# Run button

Button(window, text="run it", font=('微软雅黑', 80), command=lambda: btn_f(), bg='gray').place(x=350, y=530)
window.mainloop()  # enter the Tk event loop
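
The result box above shows each row as the `str()` of a Python list. As a possible refinement, the rows could be rendered as fixed-width columns. The sketch below is one way to do that with `str.format`; the function name `format_results` and the metric values are invented for illustration, and the alignment only holds if the Text widget uses a monospaced font.

```python
def format_results(headers, rows, width=11):
    """Render a header line and result rows as left-aligned fixed-width columns."""
    lines = ["".join("{:<{w}}".format(h, w=width) for h in headers)]
    for row in rows:
        name, values = row[0].strip(), row[1:]
        cells = ["{:<{w}}".format(name, w=width)]
        cells += ["{:<{w}.3f}".format(v, w=width) for v in values]
        lines.append("".join(cells))
    return "\n".join(lines)


headers = ["Algorithm", "Accuracy", "Precision", "Recall", "F1-score"]
# Rows in the shape returned by Evaluation_indicators; the numbers are made up.
rows = [
    ["KNN   ", 0.978, 0.976, 0.978, 0.978],
    ["GBDT  ", 1.0, 1.0, 1.0, 1.0],
]
table = format_results(headers, rows)
print(table)
```

The returned string can be passed to `text.insert` in place of the concatenated list representations.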


UI_Cluster.py

import tkinter as tk
from tkinter import *
from Button_cluster import ButtonCluster
import tkinter.messagebox

window = tk.Tk()
window.title("machine learning")
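window.geometry("800x800+50+50")  # window size and position
data_option = StringVar()
data_option.set("iris")
ifSelect = {}
diy_label = Label()

# Window layout
# Note: place() returns None, so `select` and `data` are None here. Widgets that
# take them as parent therefore attach to the root window, and every coordinate
# below is relative to the root. The two frames only act as colored backgrounds.
select = tk.Frame(window, height=450, width=250, bg='green').place(x=10, y=50)
data = tk.Frame(window, height=250, width=250, bg='red').place(x=10, y=530)
text = Text(window, width=62, height=10)
text.place(x=320, y=50)
# Captions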
ChooseLabel = tk.Label(window, text="Please select a clustering algorithm", font=('微软雅黑', 12)).place(x=10, y=10)
resultLabel = tk.Label(window, text="The clustering results are as follows",
                       font=('微软雅黑', 12)).place(x=330, y=10)
e = tk.Entry(data)
# Canvas that displays the result images
canvas = tk.Canvas(window, height=260, width=440)
image_flie = tk.PhotoImage(file="image/1.png")
image = canvas.create_image(0, 0, anchor="nw", image=image_flie)
canvas.place(x=320, y=230)
pic_names = ["1.png"]
pic_name = pic_names[0]
pic_index = 0


# Show the current image, then advance to the next one (wrapping around)
def swicth_pic(pic_name1):
    global image, image_flie, pic_index, pic_names, pic_name
    image_flie = tk.PhotoImage(file='image/'+pic_name1)
    image = canvas.create_image(0, 0, anchor="nw", image=image_flie)
    pic_index += 1
    if pic_index >= len(pic_names):
        pic_index = 0
    pic_name = pic_names[pic_index]


Button(window, text="next pic", font=('微软雅黑', 8), command=lambda: swicth_pic(pic_name), bg='gray').place(x=650, y=200)


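def diy_data():
    global diy_label
    # Remove any label left from an earlier click, then show a label and an
    # entry box where the user types the path of the data file
    diy_label.place_forget()
    diy_label = Label(data, text="Data file path:", font=('微软雅黑', 10), bg="red")
    diy_label.place(x=10, y=700)
    e.place(x=110, y=703)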


def delete_diy():
    global diy_label
    diy_label.place_forget()
    e.place_forget()


def reflush():
    global ifSelect, data_option
    # Checkboxes for choosing the algorithms
    algorithm = {0: 'K-means', 1: 'BIRCH', 2: 'DBSCAN', 3: 'GMM', 4: 'OPTICS', 5: 'Mean Shift'}
    # One BooleanVar per checkbox records whether it is ticked

    for i in range(len(algorithm)):
        ifSelect[i] = BooleanVar()
        Checkbutton(select, text=algorithm[i], font=('微软雅黑', 12), variable=ifSelect[i], bg='green') \
            .place(x=30, y=80 + i * 55, anchor="nw")

    # Data section
    # Radio buttons for the built-in small, medium, and large datasets, plus custom data

    tk.Radiobutton(window, text="Small dataset", variable=data_option, value="iris", bg='red', command=delete_diy) \
        .place(x=30, y=550)
    tk.Radiobutton(window, text="Medium dataset", variable=data_option, value="wine_data", bg='red', command=delete_diy) \
        .place(x=30, y=580)
    tk.Radiobutton(window, text="Large dataset", variable=data_option, value="breast_cancer", bg="red", command=delete_diy) \
        .place(x=30, y=610)
    tk.Radiobutton(window, text="Custom data", variable=data_option, value="diy", command=diy_data, bg='red') \
        .place(x=30, y=640)


reflush()


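# At this point all the inputs we need are available.
# The function below is bound to the run button.


def btn_f():
    global pic_names, pic_name, pic_index
    # Read the checkbox states into a plain list and leave the BooleanVar objects
    # in ifSelect untouched, so the handler can safely run again
    selected = [ifSelect[i].get() for i in range(6)]
    if not any(selected):
        tkinter.messagebox.showerror("Error", "No algorithm selected")
        return  # stay in the window instead of quitting the program

    e_value = e.get()
    all_pics = ["K_means.png", "BIRCH.png", "DBSCAN.png", "GMM.png", "OPTICS.png", "Mean_Shift.png"]
    pic_names = [name for name, chosen in zip(all_pics, selected) if chosen]
    pic_name = pic_names[0]
    pic_index = 0  # restart the image cycle at the first selected algorithm
    # Run and evaluate the selected algorithms
    btnf = ButtonCluster(selected, data_option.get(), e_value)
    btnf.get_data()
    result = btnf.run()
    # Column order matches Cluser.Evaluation_indicators
    show_ = "Algorithm  Purity  ARI  F1-score  MI  Homogeneity  Completeness  V-measure\n"
    for i in result:
        show_ = show_ + str(i) + '\n'
    text.delete('1.0', 'end')
    text.insert(INSERT, show_)
    reflush()  # reset the checkboxes, as the original did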


# Run button

Button(window, text="run it", font=('微软雅黑', 80), command=lambda: btn_f(), bg='gray').place(x=350, y=530)
window.mainloop()  # enter the Tk event loop
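
The "next pic" logic in `swicth_pic` keeps its state in several module-level globals (`pic_names`, `pic_name`, `pic_index`). A possible alternative, sketched here without any Tk dependency, is to keep that state in one small object. The class name `ImageCycler` and its methods are hypothetical and do not appear in the project.

```python
class ImageCycler:
    """Cycle through a fixed list of image file names, wrapping at the end."""

    def __init__(self, names):
        if not names:
            raise ValueError("at least one image name is required")
        self.names = list(names)
        self.index = 0

    def current(self):
        """Return the name that should be displayed now."""
        return self.names[self.index]

    def advance(self):
        """Move to the next name (wrapping around) and return it."""
        self.index = (self.index + 1) % len(self.names)
        return self.current()


cycler = ImageCycler(["K_means.png", "DBSCAN.png", "GMM.png"])
shown = [cycler.current()] + [cycler.advance() for _ in range(3)]
print(shown)
```

In the UI, the button callback would load `'image/' + cycler.current()` into the canvas and then call `cycler.advance()`; after each run, a new `ImageCycler` built from the selected algorithms would replace the old one.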


UI_forecast.py

import tkinter as tk
from tkinter import *
from Button_forecast import ButtonForecast
import tkinter.messagebox

window = tk.Tk()
window.title("machine learning")
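window.geometry("800x800+50+50")  # window size and position
data_option = StringVar()
data_option.set("iris")
ifSelect = {}
diy_label = Label()

# Window layout
# Note: place() returns None, so `select` and `data` are None here. Widgets that
# take them as parent therefore attach to the root window, and every coordinate
# below is relative to the root. The two frames only act as colored backgrounds.
select = tk.Frame(window, height=450, width=250, bg='green').place(x=10, y=50)
data = tk.Frame(window, height=250, width=250, bg='red').place(x=10, y=530)
text = Text(window, width=62, height=10)
text.place(x=320, y=50)
# Captions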
ChooseLabel = tk.Label(window, text="Please select a prediction algorithm", font=('微软雅黑', 12)).place(x=10, y=10)
resultLabel = tk.Label(window, text="The predicted results are as follows", font=('微软雅黑', 12)).place(x=330, y=10)
e = tk.Entry(data)

canvas = tk.Canvas(window, height=260, width=440)
image_flie = tk.PhotoImage(file="image/1.png")
image = canvas.create_image(0, 0, anchor="nw", image=image_flie)
canvas.place(x=320, y=230)
pic_names = ["1.png"]  # swicth_pic() already prepends "image/", so store the bare file name
pic_name = pic_names[0]
pic_index = 0


# Show the current image, then advance to the next one (wrapping around)
def swicth_pic(pic_name1):
    global image, image_flie, pic_index, pic_names, pic_name
    image_flie = tk.PhotoImage(file='image/' + pic_name1)
    image = canvas.create_image(0, 0, anchor="nw", image=image_flie)
    pic_index += 1
    if pic_index >= len(pic_names):
        pic_index = 0
    pic_name = pic_names[pic_index]


Button(window, text="next pic", font=('微软雅黑', 8), command=lambda: swicth_pic(pic_name), bg='gray').place(x=650, y=200)


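def diy_data():
    global diy_label
    # Remove any label left from an earlier click, then show a label and an
    # entry box where the user types the path of the data file
    diy_label.place_forget()
    diy_label = Label(data, text="Data file path:", font=('微软雅黑', 10), bg="red")
    diy_label.place(x=10, y=700)
    e.place(x=110, y=703)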


def delete_diy():
    global diy_label
    diy_label.place_forget()
    e.place_forget()


def reflush():
    global ifSelect, data_option
    # Checkboxes for choosing the algorithms
    algorithm = {0: 'Bayesian network', 1: 'Markov model', 2: 'Linear regression', 3: 'XGBoost',
                 4: 'Ridge regression', 5: 'Polynomial regression', 6: 'Decision tree regression'}
    # One BooleanVar per checkbox records whether it is ticked

    for i in range(len(algorithm)):
        ifSelect[i] = BooleanVar()
        Checkbutton(select, text=algorithm[i], font=('微软雅黑', 12), variable=ifSelect[i], bg='green') \
            .place(x=30, y=80 + i * 55, anchor="nw")

    # Data section
    # Radio buttons for the built-in small, medium, and large datasets, plus custom data

    tk.Radiobutton(window, text="Small dataset", variable=data_option, value="iris", bg='red', command=delete_diy) \
        .place(x=30, y=550)
    tk.Radiobutton(window, text="Medium dataset", variable=data_option, value="wine_data", bg='red', command=delete_diy) \
        .place(x=30, y=580)
    tk.Radiobutton(window, text="Large dataset", variable=data_option, value="breast_cancer", bg="red", command=delete_diy) \
        .place(x=30, y=610)
    tk.Radiobutton(window, text="Custom data", variable=data_option, value="diy", command=diy_data, bg='red') \
        .place(x=30, y=640)


reflush()


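# At this point all the inputs we need are available.
# The function below is bound to the run button.


def btn_f():
    global pic_names, pic_name, pic_index
    # Read the checkbox states into a plain list and leave the BooleanVar objects
    # in ifSelect untouched, so the handler can safely run again
    selected = [ifSelect[i].get() for i in range(7)]
    if not any(selected):
        tkinter.messagebox.showerror("Error", "No algorithm selected")
        return  # stay in the window instead of quitting the program

    e_value = e.get()
    all_pics = ["byes.png", "Markov.png", "LR.png", "XGBoost.png", "RidgeCv.png", "polynomial.png",
                "DT.png"]
    pic_names = [name for name, chosen in zip(all_pics, selected) if chosen]
    pic_name = pic_names[0]
    pic_index = 0  # restart the image cycle at the first selected algorithm
    # Train and evaluate the selected algorithms
    btnf = ButtonForecast(selected, data_option.get(), e_value)
    btnf.get_data()
    result = btnf.run()
    # Column order matches Forecast.Evaluation_indicators
    show_ = "Algorithm    MSE  RMSE  MAE  MAPE  SMAPE\n"
    for i in result:
        show_ = show_ + str(i) + '\n'
    text.delete('1.0', 'end')
    text.insert(INSERT, show_)
    reflush()  # reset the checkboxes, as the original did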


# Run button
Button(window, text="run it", font=('微软雅黑', 80), command=lambda: btn_f(), bg='gray').place(x=350, y=530)
window.mainloop()  # enter the Tk event loop

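Machine learning is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, and computational complexity theory. It studies how computers can simulate or realize human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures so that performance keeps improving. Machine learning is at the core of artificial intelligence and is a fundamental route to making computers intelligent. As statistics developed, statistical learning took on an important role in machine learning: the introduction and development of algorithms such as support vector machines (SVM), decision trees, and random forests made it better at classification, regression, and clustering tasks. In the 21st century, deep learning became a major breakthrough. Multi-layer neural networks trained on large amounts of data with substantial computing power have achieved notable results in computer vision, natural language processing, and speech recognition. Machine learning algorithms are widely applied in healthcare, finance, retail and e-commerce, intelligent transportation, and manufacturing. In healthcare, for example, they can help doctors interpret medical images, assist in diagnosis, predict how a condition will develop, and offer patients personalized treatment plans. In finance, models can analyze financial data, identify potential risks, and forecast stock market movements. Looking ahead, as sensor technology and computing power improve, machine learning will play a larger role in areas such as autonomous driving and smart homes, and as the Internet of Things spreads, it will help smart home devices become more intelligent and personalized. In industrial manufacturing it will also see wide use, for example in smart manufacturing, process optimization, and quality control. In short, machine learning is a discipline with broad application prospects and far-reaching influence; it will continue to drive the development of artificial intelligence and contribute to the progress of society.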