dataframe 输出标题_【学界】第八章：Python代码之数据输出、调参与算法总结

最新推荐文章于 2024-05-03 18:41:34 发布

weixin_39988930

最新推荐文章于 2024-05-03 18:41:34 发布

阅读量166

点赞数

文章标签： dataframe 输出标题

本文链接：https://blog.csdn.net/weixin_39988930/article/details/111629396

版权

作者：小杨
学校：广东工业大学
年级：研二
专业：工业工程
主要研究兴趣：强化学习、深度学习
简介：作者是广东工业大学2016级工业工程系研究生，师从广东工业大学教授、博士生导师、《工业工程》杂志主编，陈庆新教授，广东工业大学讲师、港大博士，张婷。于研一至研二第一学期初次进行学术研究，发表一篇被EI期刊、国内Top工程期刊《系统工程理论与实践》收录，两篇国际学术会议文章（International Conference on Networking, Sensing and Control 2018，Healthcare Operations Management SIG 2018 Conference），完成一个国际会议poster演讲（Manufacturing and Service Operations Management Conference 2018 ），投稿一篇文章至Journal of Clean Production正在初审。目前研究兴趣主要集中在机器学习与运筹学的交叉应用领域。

这系列文章是个人的一些体会，也是对上一阶段工作的总结。由于是初次涉水，阅历尚浅，希望通过总结来反省，希望收获大神的意见~

研究课题：

[ 社区居家养老调度与路径规划问题 ]

[ Community Home Health Care Scheduling and Routing Problem ]

------本文目录------

第一、二、三章：论文选题、文献综述方法、问题分析、与模型假设

第四章：马氏决策过程&机会约束模型&多人随机路径规划

第五章：Python代码之模型与Q-Learning

第六章：Python代码之蚁群算法部分

第七章：Python代码之模块整合与主程序

------8 数据输出------

8.1 数据存入excel文件

首先肯定要把解的信息储存入一个表格里。

Solution information in Excel

由于数据比较复杂，不能用numpy.savetxt()一次性解决，所以就用xlwt和pandas来新建xls文件并写入数据。在Solution类里面添加两个函数。

使用说明书：

pandas.DataFrame

pandas.DataFrame.to_excel

import pandas as pd
import xlwt

class Solution:
    __slots__ = [..., 'info_to_excel']

    def __init__(self):
        ...
        # Information to excel
        self.info_to_excel = []

    def calculate(self, q_learning_result, remaining_demand):
        ...

    def get_solution_info(self):
        indexes = []
        skill_level = []
        workload = []
        waiting = []
        avg_waiting = []
        route_e = []
        route_j = []
        fulfilled_d = []
        # Stack solution information by columns
        for i in range(len(self.nl)):
            indexes.append(self.nl[i].l)
            skill_level.append(self.nl[i].s)
            workload.append(self.nl[i].tt)
            waiting.append(self.nl[i].twt)
            avg_waiting.append(self.nl[i].avg_w)
            e = []
            j = []
            for r in range(len(self.nl[i].r)):
                e.append(self.nl[i].r[r].e)
                j.append(self.nl[i].r[r].l)
            route_e.append(e)
            route_j.append(j)
            fulfilled_d.append(self.nl[i].sd)
        # return organized information as a list
        self.info_to_excel = copy.deepcopy([indexes, skill_level, workload, waiting, avg_waiting, route_e, route_j, fulfilled_d])

    def save_solution_info(self):
        # save solution information into excel file
        exc = xlwt.Workbook()
        file_name = 'solutions.xls'
        exc.add_sheet("Sheet1")
        exc.save(file_name)
        writer = pd.ExcelWriter(file_name, sheet_name='Sheet1')
        solution_df = pd.DataFrame({'Nurses': self.info_to_excel[0],
                                    'Skill': self.info_to_excel[1],
                                    'Workload': self.info_to_excel[2],
                                    'Waiting': self.info_to_excel[3],
                                    'Average waiting': self.info_to_excel[4],
                                    'Routes by elder': self.info_to_excel[5],
                                    'Routes by job': self.info_to_excel[6],
                                    'Fulfilled demands': self.info_to_excel[7]})
        solution_df.to_excel(writer, sheet_name='Sheet1', index=False)
        writer.save()

关于save_solution_infor()

*to_excel()方法是一列一列记录数据的，所以在get_solution_info()中要以列为单位建立列表，每一列都作为数组的一个元素，建立一个二元数组Solution.info_to_excel = [[...]]

*先用xlwt的workbook()方法新建一个Excel文件

*然后用pandas的ExcelWriter()方法向该文件里写入数据

*由于默认的index是从0排列的，所以用nurse的label属性建立索引，而to_excel的index选项设置为False

关于create_fig_text()

*这是城市

8.2 画图

那么除了以上两个评价函数用来画图，为了直观的看到实验结果，我们还可以输出到达时刻图、算法耗时图等等。还是要写个函数来调用，程序员就得尽量偷懒，啊不，优雅。

输入：
文件标题
x、y轴的标题
x轴的大小（默认以1为间隔画轴，所以x要为整数）
y轴数据
图说明文字开始的x坐标
图的说明文字

matplotlib.pyplot.savefig - Matplotlib 2.2.3 documentation

matplotlib.pyplot.text - Matplotlib 2.2.3 documentation

**kwargs of matplotlib text - Matplotlib 2.2.3 documentation

import matplotlib.pyplot as plt

def plot(figure_name, y_label, x_label, x, y, figure_text):
    y_lower = min(y) - 5
    y_upper = max(y) + 5
    plt.ylabel(y_label)
    plt.xlabel(x_label)
    plt.xlim(0, x)
    plt.ylim(y_lower, y_upper)
    plt.plot(y)
    # Figure explanations
    plt.text(x-1, y_lower+1, figure_text, fontsize=9, va="baseline", ha="right")
    name = figure_name + '.png'
    plt.savefig(name, dpi=900, bbox_inches='tight')
    plt.close()

*注意我这里设置y轴的上下限是基于y轴数据点的，这样可以减少图的空白区域，可做调整

*plt.text()方法里的va="baseline"和ha="right"是右下对齐的方式，以(x-1, y_lower+1)为坐标定位文本块，是属于**kwargs里面的参数。继承至matplotlib.artist.Artist类：

*遗留一个问题，savefig()中bbox_inches="tight"具体用处不是很懂，有参考链接，希望有朋友解惑：Figure.savefig() not respecting bbox_inches='tight' when dpi specified · Issue #7820 · matplotlib/matplotlib

然后在Solution类里添加一个内部函数，用来生成图片的题注。

*首先我们有两个过程需要画图，第一是训练过程，第二是测试过程，那么训练过程我们需要记录算法花了多长时间来训练，所以输出时要记录下running_time，但是训练完毕，测试Q矩阵的时候，就不用去记录running_time啦（时间真的很短）。

import pandas as pd
import xlwt

class Solution:
    __slots__ = [..., 'info_to_excel'， 'fig_t']

    def __init__(self):
        ...
        # Figure test
        self.fig_t = ''

    def calculate(self, q_learning_result, remaining_demand):
        ...
    def get_solution_info(self):
    ...
    def save_solution_info(self):
    ...
    def create_fig_text(self, running_time):
        if running_time != 0:
            self.fig_t = "n" 
                         "Running time: " + str(float('%.2f' % running_time)) + " mins n" 
                         "Results of best solution found:n" 
                         "Solution evaluation value = " + str(self.ev) + "n" 
                         "Solution avg workload of nurse = " + str(self.a_work_n) + "n" 
                         "Solution avg waiting time of job = " + str(self.a_wait_j) + "n" 
                         "Solution avg waiting time of nurse = " + str(self.a_wait_n) + "n" 
                         "Solution total workload = " + str(self.t_work) + "n" 
                         "Solution total waiting time = " + str(self.t_wait) + "n" 
                         "Solution remaining demands = " + str(self.rd)
        else:
            self.fig_t = "n" 
                         "Results of testing the trained agent:n" 
                         "Solution evaluation value = " + str(self.ev) + "n" 
                         "Solution avg workload of nurse = " + str(self.a_work_n) + "n" 
                         "Solution avg waiting time of job = " + str(self.a_wait_j) + "n" 
                         "Solution avg waiting time of nurse = " + str(self.a_wait_n) + "n" 
                         "Solution total workload = " + str(self.t_work) + "n" 
                         "Solution total waiting time = " + str(self.t_wait) + "n" 
                         "Solution remaining demands = " + str(self.rd)

8.3 调参与实验

做实验啦，终于到了验证算bug法de性shu能liang的时刻了。

不同ACO参数对评价函数值，及其收敛速度的影响
不同QL参数对评价函数值，及其收敛速度的影响
不同CCP参数对评价函数值，及规划解的影响

ACO相关的参数
蚂蚁数量（一条路径跑多少次）
初始信息素量
挥发系数
信息素在概率转移函数中的权重
等待时长在概率转移函数中的权重
Q Learning相关的参数
学习速率
折扣系数
贪婪系数
等待时长在奖励中的权重
Chance Constrained Programming模型的参数
*需要recall的朋友请转第四章
护工老人相熟度增量
CCWT置信度 - α
CCO置信度 - β
每次任务允许的最大等待时长 - C
每个护工工作表的最大工时 - W

现在整个算法看起来是这个样子的。（省略了一部分之前重复的，免得冗长）

if __name__ == '__main__':

    """------Instance representations------"""
    ......

    """------Q Learning parameters------"""
    ql_learning_rate_para = 0.7
    ql_discount_para = 0.9
    ql_greedy_para = 0.1
    ql_waiting_para = -0.01
    ql_q_matrix = np.zeros((8, 3))
    ql_absorbing_state = [6, 7]

    """------ACO parameters------"""
    aco_alpha = 5
    aco_beta = 5
    aco_rho = 0.3
    aco_ant_number = 30
    aco_pheromone = 20
    aco_pheromone_matrix = np.ones((len(e_init_distance_matrix[0]), len(e_init_distance_matrix[0]))) * aco_pheromone

    """------CCP model parameters------"""
    ccp_acquaintance_increment = 0.01
    ccp_alpha = 1.29
    ccp_beta = 1.29
    ccp_waiting_limit = 15
    ccp_workload_limit = 480

    """---Solution Recording Variables---"""
    solution_final = Solution()

    """Figure data record"""
    axis_sub_evaluation_value_iter = []
    axis_evaluation_value_iter = []

    """---Start iteration---"""
    # Record the starting time of the training process
    start = time.time()
    iter = 0
    iter_max = 500
    print('Current Experimental Instance is ' + file_index)
    print('Instance Scale: ' + str(file_scale-1))
    print('Start training...')
    while iter < iter_max:
        ......
        axis_evaluation_value_iter.append(solution_final.ev)
        axis_sub_evaluation_value_iter.append(sub_solution.ev)

    # Record the ending time of the training process
    end = time.time()
    rt = (end - start) / 60
    # Create figure text for the solution
    solution_final.create_fig_text(rt)
    print(solution_final.fig_t)
    # Save detailed information into an excel file
    solution_final.get_solution_info()
    solution_final.save_solution_info("final solution.xls")
    # Plot the figures
    plot("iter - evaluation value", "evaluation value", "iteration",
         iter_max - 1, axis_evaluation_value_iter, solution_final.fig_t)
    plot("iter - sub evaluation value", "sub ev", "iteration",
         iter_max - 1, axis_sub_evaluation_value_iter, solution_final.fig_t)
    # Show the trained Q matrix on the console
    print(np.around(ql_q_matrix, decimals=2))

*为了画图，我们需要把每个episode的评估函数值记录下来，成为一个数组。所以在while的每个循环结束之前，我把想要记录的随着episode前进而变化的值，append进这些列表里面去。训练过程结束后（while loop ends），调用plot()方法，图就画出来了。

...
"""Figure data record"""
axis_sub_evaluation_value_iter =[] axis_evaluation_value_iter =[]
...
training...
...
plot("iter - evaluation value", "evaluation value", "iteration",
iter_max -1, axis_evaluation_value_iter, solution_final.fig_t)

*time.time()这个东西是记录系统当前时间的，是现实时间（单位是秒），就是windows系统右下角那个。如果用的是time.clock()，记录下来的是CPU跳动的次数。16.3. time - Time access and conversions - Python

*iter这个变量是从0开始算的，而循环语句用的是while iter < iter_max，所以调用plot()时候，x轴最大值要输入iter_max -1

接下来，把所有要测试的参数，都换成数组来表示。

然后就用for循环，来测试每一个参数组合。输出文件的命名根据for循环来改名，如此一来，就可以让计算机跑一晚上，自己睡一觉，明早查看结果了。orz...

由于这类算法的参数很多，排列组合起来需要做数不胜数的实验，所以调参的时候要根据算法的架构、模型的架构、评估函数的形式来确定大致在哪个范围里测试参数。对于我自己写的这个算法，个人认为以ACO→QL→CCP这个顺序来依次进行调参来得有效。

*附上每次训练完毕之后，测试Q矩阵的代码。主要不同点就在于把q_learning()方法做了小修改，把greedy参数改为0，然后删掉了更新Q矩阵的代码。这样就让agent单纯根据一个不变的Q矩阵来行动，看看给出的解如何。

if __name__ == '__main__':

    """------Instance representations------"""
    .......
    """Parameter settings"""
    # learningRate discount greedy workTimePara
    para_ql = [[0.1, 0.9, 0.5, -0.01],
               [0.5, 0.9, 0.5, -0.01],
               [0.9, 0.9, 0.5, -0.01]]
    # initialPheromone alpha beta evaporation
    para_aco = [[20, 1, 1, 0.1],
                [20, 1, 1, 0.5],
                [20, 5, 1, 0.1],
                [20, 5, 1, 0.5],
                [20, 1, 5, 0.1],
                [20, 1, 5, 0.5]]
    # confidence levels:        100%  95%   90%   80%   70%   60%   50%
    # inverse standard normal:  3.09  1.65, 1.29  0.85  0.52  0.26  0
    # acquaintance_increment waiting workload alpha beta  
    para_ccp = [[0.05, 10, 480, 1.29, 1.29],
                [0.05, 20, 480, 1.29, 1.29],
                [0.05, 30, 480, 1.29, 1.29],
                [0.05, 40, 480, 1.29, 1.29],
                [0.05, 50, 480, 1.29, 1.29]]
    # Experiment
    for ex in range(len(para_ql)):
        """------Q Learning parameters------"""
        ql_learning_rate_para = para_ql[ex][0]
        ql_discount_para = para_ql[ex][1]
        ql_greedy_para = para_ql[ex][2]
        ql_waiting_para = para_ql[ex][3]

        """------ACO parameters------"""
        ...
        """------CCP model parameters------"""
        ...
        ...
        while iter < iter_max:
            ...
            ...
            ...

        """Solution after training"""
        print("nStart testing...")
        trained_q_matrix = copy.deepcopy(ql_q_matrix)
        print("Trained Q matrix:")
        test_nurses = copy.deepcopy(e_nurses_skill)
        test_targets = copy.deepcopy(e_jobs)
        test_preference_matrix = copy.deepcopy(e_preference_matrix)
        test_pheromone_matrix = copy.deepcopy(aco_pheromone_matrix)
        test_solution = Solution()
        test_ql_results = QL_BWACO.q_learning_test(test_solution.nl, test_targets, test_nurses,
                                                   trained_q_matrix, 0, ql_absorbing_state,
                                                   # Aco para
                                                   ccp_acquaintance_increment, ccp_alpha, ccp_beta, ccp_waiting_limit,
                                                   ccp_workload_limit,
                                                   aco_alpha, aco_beta, aco_rho, aco_ant_number, test_pheromone_matrix,
                                                   e_walk_speed, test_preference_matrix, e_init_distance_matrix,
                                                   e_service_time_mean,
                                                   e_jobs[0])
        test_solution.calculate(test_ql_results, test_targets)
        test_solution.create_fig_text(0)
        print(test_solution.fig_t)
        test_solution.get_solution_info()
        test_solution.save_solution_info("test solution.xls")

8.4 算法部分总结

这个程序到现在就已经完结了，捋一捋思路：

用蚁群算法求解单人路径规划任务
用QLearning选择先后进行routing的护工
每个QL的episode输出一个护工的工作安排规划
根据这个规划，预测等待时长并评价其优劣性
Step 4的结果用来更新Q矩阵，并在每个episode结束时记录下经历过的最优解
经过很多episode之后，用q_learning_test()方法测试Q矩阵的训练结果

从算法的思想上来说，我考虑用强化学习和蚁群算法结合来求解一个多人路径规划问题，试图让计算机在trail-and-error的过程中去找到一个近似的最优解，并将经验量化成一个Q矩阵记录下来，使得以后在解同一个问题的时候，能够免去训练过程而快速给出一个合理的规划方案。这里主要存在三个问题：

这个模型其实是一个多目标规划模型，我用scalarisation的方法将两个目标（最大化需求满足量和最小化等待时长），将多目标规划转换成单目标规划，也就是普通的规划问题。但是scalarisation的合理性有待考量（也就是评估解的质量的函数性能有待考量）。
蚁群算法（ACO）是一种进化算法（Evolutionary Algorithm），这类算法以效率出名，但是其给出的解往往没有经过严格证明为Pareto Optimal的过程。虽然EA算法的应用很广，但研究多目标规划问题的学者们对EA算法的态度是有分歧的，一部分人认为其不够严谨（给出的解没有足够的说服力），另一部分人认为虽然没有严谨的证优理论支撑，但是解的质量在很多现实应用中是令人满意的。对于这个问题的讨论可以参考：Research Gate - Why_do_most_of_the_multi-objective_optimization_techniques_consider_scalarization 中 Ruhul_Sarker 与 Hermann_Schichl 两人的争论。
马尔科夫模型中对状态的定义其实很浅（shallow），虽然状态空间降维了，但是也损失了模型的精确度，导致算法不够generalised，一旦环境（案例）变化较大，就开始给出一些很不合理的解（因为Q矩阵能够表示的知识太有限了）。

整个项目的代码已经放在GitHub上了IanYangChina/QL-ACO，包括四个实验案例，两个算法流程图。日后会持续更新GitHub上的内容，欢迎发布issue和pull request :)。