Optimization Algorithms, Implemented in Python (1)

kewing

Category: Python. Tags: algorithm, python, optimization, domain, testing, travel

Article link: https://blog.csdn.net/kewing/article/details/6160063


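This post discusses optimization algorithms such as random search, simulated annealing, hill climbing, and genetic algorithms, with reference to Programming Collective Intelligence (《集体智慧编程》). Since I am not very familiar with Python, I also discuss some Python along the way. Algorithms are best discussed through a concrete example or problem, so we introduce the group travel problem.

As you will see, though, the group travel part is just scaffolding; the real substance is the optimization algorithms starting from section IV: random search, simulated annealing, hill climbing, and genetic algorithms.

Group travel:

A Java version of this problem is in vivizhyy's blog post: http://vivizhyy.javaeye.com/blog/643048

See that post for the problem description.

Sections I to III below are not part of the optimization algorithms themselves, but they are the prerequisite and the key to using the algorithms correctly and effectively. If you are only interested in the optimization algorithms, skip down to section IV.

I. First, define the data set and helper functions: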

# chr05 -- optimization.py
import time
import random
import math

# Travelers: (name, home airport)
people = [('Seymour', 'BOS'),
          ('Franny', 'DAL'),
          ('Zooey', 'CAK'),
          ('Walt', 'MIA'),
          ('Buddy', 'ORD'),
          ('Les', 'OMA')]

# Destination
destination = 'LGA'

# Load the flight data from the schedule file
flights = {}
with open('schedule.txt') as file:
    for line in file:
        origin, dest, depart, arrive, price = line.strip().split(',')
        flights.setdefault((origin, dest), [])
        # Add the flight details
        flights[(origin, dest)].append((depart, arrive, int(price)))

def getminutes(t):
    x = time.strptime(t, '%H:%M')
    return x[3]*60 + x[4]

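1. About opening files:

Python 3 removed the file() built-in; use open() instead. The with open(...) block above is a neat use of with: the file is closed automatically when the block exits. See: http://woodpecker.org.cn/diveintopython3/files.html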

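2. line.strip()

line.strip() removes leading and trailing whitespace (or other specified characters) from a string. It is similar to Trim, LTrim, and RTrim in VB, but strip can remove characters other than spaces as well. See: http://blog.csdn.net/suofiya2008/archive/2010/05/19/5608309.aspx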
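As a quick illustration of how strip behaves, here is a small sketch. The sample string is made up for the example and is not taken from schedule.txt.

```python
# Hypothetical sample line; the real schedule.txt content is not shown in this post.
raw = '  LGA,OMA,6:19,8:13,239\n'

# With no argument, strip() removes leading and trailing whitespace,
# including the trailing newline.
clean = raw.strip()

# With an argument, strip() removes any of the given characters from both ends.
trimmed = 'xxLGAxx'.strip('x')

# lstrip() and rstrip() work on one side only, like LTrim and RTrim in VB.
left_only = '  LGA  '.lstrip()

# After stripping, the line splits cleanly into its five fields.
fields = clean.split(',')
print(fields)
```

Stripping before splitting matters here because otherwise the last field would keep the newline character.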

3. flights is a dictionary whose keys are (origin, dest) tuples and whose values are lists of (depart, arrive, price) tuples.
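To make the shape of this dictionary concrete, the sketch below builds a small one from in-memory lines. The names sample_lines and sample_flights and the flight values are assumptions for illustration only.

```python
# Hypothetical sample lines in the origin,dest,depart,arrive,price format.
sample_lines = [
    'LGA,OMA,6:19,8:13,239',
    'OMA,LGA,6:11,8:31,249',
    'LGA,OMA,8:04,10:59,136',
]

sample_flights = {}
for line in sample_lines:
    origin, dest, depart, arrive, price = line.strip().split(',')
    # setdefault returns the list already stored under the key,
    # or stores the new empty list and returns it.
    sample_flights.setdefault((origin, dest), []).append(
        (depart, arrive, int(price)))

# Look up all flights on one route, then pick one of them by index.
lga_to_oma = sample_flights[('LGA', 'OMA')]
print(lga_to_oma[1])
```

This is why a solution in the later sections can be a plain list of integers: each integer is an index into the list stored for one route.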

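4. About getminutes

getminutes converts an 'H:MM' time string into minutes since midnight, so that arrival and departure times can be compared and subtracted as plain integers.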
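The sketch below exercises getminutes on its own. getminutes_split is a hypothetical alternative (not in the original code) that parses the string by hand; both are expected to agree.

```python
import time

def getminutes(t):
    # In the struct_time returned by strptime, index 3 is the hour
    # and index 4 is the minute.
    x = time.strptime(t, '%H:%M')
    return x[3] * 60 + x[4]

def getminutes_split(t):
    # Assumed alternative for comparison: split on ':' and convert by hand.
    hours, minutes = t.split(':')
    return int(hours) * 60 + int(minutes)

print(getminutes('8:13'), getminutes_split('8:13'))
```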

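5. Printing the dictionary

I looked for a neat way to print flights directly and did not find one; print(flights) puts everything on a single line, so a for loop is used to print one entry per line: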

for item in flights.items():
    print(item)

# Or:
for key, value in flights.items():
    print(key, value)

# Before Python 3 this also worked (iteritems() does not exist in Python 3):
# for item in flights.iteritems():
#     print(item)

# Full version: load the flight data from the file, then print it
flights = {}
with open('schedule.txt') as file:
    for line in file:
        origin, dest, depart, arrive, price = line.strip().split(',')
        flights.setdefault((origin, dest), [])
        # Add the flight details
        flights[(origin, dest)].append((depart, arrive, int(price)))

for item in flights.items():
    print(item)

Before Python 3 you could write

#for item in flights.iteritems():
#    print(item)

This does not work in Python 3, where iteritems() was removed in favor of items(). See: http://www.sciencenet.cn/blog/user_content.aspx?id=372718 and: http://topic.csdn.net/u/20090721/13/91bf1e47-3d94-4b40-a18b-9406ac6985f3.html

II. Convert the dictionary data in flights into tabular form.

# Flights → table
def printschedule(r):
    for idx in range(int(len(r)/2)):
        name = people[idx][0]
        origin = people[idx][1]
        # r holds two flight indices per person: outbound, then return
        out = flights[(origin, destination)][int(r[2*idx])]
        ret = flights[(destination, origin)][int(r[2*idx+1])]
        print('%10s%10s %5s-%5s $%3s %5s-%5s $%3s' %
              (name, origin, out[0], out[1], out[2], ret[0], ret[1], ret[2]))

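In the line for idx in range( int(len(r)/2) ):, omitting the int conversion of len(r)/2 raises an error, because division with / returns a float in Python 3 and range requires an integer. (len(r)//2 would also work, since floor division of two ints returns an int.) For the same reason, the problem described at http://apps.hi.baidu.com/share/detail/23147749 no longer occurs.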
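The difference between the division operators can be checked with a short sketch. The list r is an arbitrary 12-element example, not a meaningful schedule.

```python
r = [1, 4, 3, 2, 7, 3, 6, 3, 2, 4, 5, 3]  # arbitrary 12-element solution list

half_float = len(r) / 2      # true division: always a float in Python 3
half_int = int(len(r) / 2)   # the conversion used in printschedule
half_floor = len(r) // 2     # floor division: already an int

print(half_float, half_int, half_floor)

# range() only accepts integers, so passing the float fails.
try:
    range(half_float)
except TypeError as exc:
    print('range() rejects a float:', exc)
```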

See also: http://hi.baidu.com/gofight/blog/item/a73e3e1fcf9e74fe1ad576e8.html

On print formatting, see: http://blogold.chinaunix.net/u2/84280/showart_2068008.html. The % formatting used here is much like printf in C.
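As a small illustration of the format string used in printschedule, the sketch below formats one row. The flight values are hypothetical; only the format string comes from the code above.

```python
# Hypothetical values, shaped like one row printed by printschedule.
name, origin = 'Seymour', 'BOS'
out = ('8:04', '10:11', 95)
ret = ('12:08', '14:05', 142)

# %10s right-aligns a value in a field 10 characters wide;
# %5s and %3s do the same with narrower fields.
row = '%10s%10s %5s-%5s $%3s %5s-%5s $%3s' % (
    name, origin, out[0], out[1], out[2], ret[0], ret[1], ret[2])
print(row)
```

Because every field has a fixed width, rows for different people line up in columns.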

III. Cost calculation:

The cost function for this problem is:

# Cost calculation
def schedulecost(sol):
    totalprice = 0
    latestarrival = 0
    earliestdep = 24*60

    for idx in range(int(len(sol)/2)):
        # Get the outbound and return flights
        origin = people[idx][1]
        outbound = flights[(origin, destination)][int(sol[2*idx])]
        returnf = flights[(destination, origin)][int(sol[2*idx+1])]

        # Total price is the sum of outbound and return prices
        totalprice += (outbound[2] + returnf[2])

        # Track the latest arrival and the earliest departure
        latestarrival = max(latestarrival, getminutes(outbound[1]))
        earliestdep = min(earliestdep, getminutes(returnf[0]))

    # Compute the waiting cost
    totalwait = 0
    for idx in range(int(len(sol)/2)):
        origin = people[idx][1]
        outbound = flights[(origin, destination)][int(sol[2*idx])]
        returnf = flights[(destination, origin)][int(sol[2*idx+1])]
        totalwait += latestarrival - getminutes(outbound[1])
        totalwait += getminutes(returnf[0]) - earliestdep

    # To account for the car rental fee:
    if latestarrival > earliestdep:
        totalprice += 50

    return totalprice + totalwait

The algorithm can be improved to the following form:

# Cost calculation 2
def schedulecost_x(sol):
    totalprice = 0
    latestarrival = 0
    earliestdep = 24*60
    temp_out = 0   # running sum of arrival times
    temp_ret = 0   # running sum of departure times

    for idx in range(int(len(sol)/2)):
        # Get the outbound and return flights
        origin = people[idx][1]
        outbound = flights[(origin, destination)][int(sol[2*idx])]
        returnf = flights[(destination, origin)][int(sol[2*idx+1])]

        # Total price is the sum of outbound and return prices
        totalprice += (outbound[2] + returnf[2])

        # Track the latest arrival and the earliest departure
        temp_val = getminutes(outbound[1])
        latestarrival = max(latestarrival, temp_val)
        temp_out += temp_val

        temp_val = getminutes(returnf[0])
        earliestdep = min(earliestdep, temp_val)
        temp_ret += temp_val

    # Compute the waiting cost
    totalwait = 0
    totalwait += int(len(sol)/2)*latestarrival - temp_out
    totalwait += temp_ret - int(len(sol)/2)*earliestdep

    # To account for the car rental fee:
    if latestarrival > earliestdep:
        totalprice += 50

    return totalprice + totalwait

But this version is not very intuitive.
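The second version relies on the identity that summing (latestarrival - a) over n arrival times a equals n*latestarrival minus the sum of the arrival times, and likewise for departures. The sketch below checks this on made-up times (minutes since midnight); the lists arrivals and departures are assumptions for illustration, not data from schedule.txt.

```python
# Hypothetical arrival and departure times, in minutes since midnight.
arrivals = [493, 611, 545, 702]
departures = [900, 845, 1010, 870]

latestarrival = max(arrivals)
earliestdep = min(departures)
n = len(arrivals)

# Version 1: a second pass that adds up each person's wait.
wait_loop = 0
for a, d in zip(arrivals, departures):
    wait_loop += latestarrival - a
    wait_loop += d - earliestdep

# Version 2: closed form using sums that a single pass can collect.
wait_closed = (n * latestarrival - sum(arrivals)) \
    + (sum(departures) - n * earliestdep)

print(wait_loop, wait_closed)
```

Both versions should give the same total wait; the second one avoids looking up every flight a second time.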

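Choosing the cost function is a very important part of using an optimization algorithm effectively.

In fact, once the optimization algorithm is written, defining a sensible cost function and domain becomes the key. An optimization function cannot simply be picked up and used; before using it you must make sure the helper functions are correct. For details see this blog's post "Optimization Algorithms, Implemented in Python (2)".

In the initial tests we manually specified an arbitrary list of outbound and return flight choices and computed its cost.

The main purpose of the optimization algorithm is to search for a reasonable sequence that minimizes the cost.

The effective search algorithms begin below:

(OK, here come the optimization algorithms. Note the domain parameter of the optimization functions below: it differs depending on where the optimization function is used, and domain and the cost function are the key to making the optimization function effective.)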

IV: Random search

#----------------------------------------------------------------
# Random search
def randomoptimize(domain, costfunc):
    best = 999999999
    bestr = None

    # Make 1000 random guesses
    for i in range(1000):
        # Create a random solution
        r = [random.randint(domain[idx][0], domain[idx][1])
             for idx in range(len(domain))]

        # Compute its cost
        cost = costfunc(r)

        # Compare with the best solution so far
        if cost < best:
            best = cost
            bestr = r

    return bestr

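for i in range(1000): is a nice use of for in Python, arguably handier than For Each in VB, since in VB6 For Each only works with Variant variables or collections of objects.

Also, domain = [(0,9)]*(len(people)*2) defines a list of len(people)*2 tuples, all of them identical; list elements are allowed to repeat.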
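The sketch below shows what this domain looks like and how one random solution is drawn from it. The (0, 9) bounds follow the text and assume each route has ten flights; the variable same_object is added only for illustration.

```python
import random

people = [('Seymour', 'BOS'), ('Franny', 'DAL'), ('Zooey', 'CAK'),
          ('Walt', 'MIA'), ('Buddy', 'ORD'), ('Les', 'OMA')]

# One (low, high) pair per slot: an outbound and a return index per person.
domain = [(0, 9)] * (len(people) * 2)

# List multiplication repeats references to the same tuple object.
# That is harmless here because tuples are immutable.
same_object = domain[0] is domain[1]

# random.randint includes both endpoints, so every index 0..9 can appear.
r = [random.randint(low, high) for (low, high) in domain]
print(len(domain), same_object, r)
```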

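In practice, calling random search more times, or increasing the number of guesses (the loop count of the for in randomoptimize), did not yield a significantly better sequence. On my machine the best result was 31XX, and increasing the loop count did not produce anything lower.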
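To repeat this kind of experiment, one option is to expose the loop count as a parameter. The sketch below is an assumption-laden variant: randomoptimize_n, its tries parameter, and toycost are hypothetical names, and the toy cost function stands in for the flight cost so that the example runs without schedule.txt.

```python
import random

def randomoptimize_n(domain, costfunc, tries=1000):
    # Same idea as randomoptimize, with the loop count as a parameter
    # and the best cost returned alongside the best solution.
    best = float('inf')
    bestr = None
    for _ in range(tries):
        r = [random.randint(low, high) for (low, high) in domain]
        cost = costfunc(r)
        if cost < best:
            best = cost
            bestr = r
    return bestr, best

def toycost(sol):
    # Toy cost for illustration only: squared distance from all-threes.
    return sum((v - 3) ** 2 for v in sol)

random.seed(0)  # fixed seed so repeated runs are comparable
toy_domain = [(0, 9)] * 12
for tries in (10, 100, 1000, 10000):
    sol, cost = randomoptimize_n(toy_domain, toycost, tries)
    print(tries, cost)
```

Results on a toy cost say nothing definite about the flight problem; the point is only that the effect of the loop count can be measured directly.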

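V: Hill climbing

This method makes full use of the solution found so far. It rests on the idea that the optimal sequence is likely to be close to the current one, and that the current one is the best of all sequences seen before it. Evolving from the current sequence should therefore yield a better one, and when no better sequence can be produced, the evolution is finished.

The algorithm starts from a random sequence and keeps evolving the sequence obtained in the previous iteration until it finds a better, or the best, sequence. As follows: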

#----------------------------------------------------------------
# Hill climbing
def hillclimb(domain, costfunc):
    # Create a random solution
    sol = [random.randint(domain[idx][0], domain[idx][1])
           for idx in range(len(domain))]

    # main loop
    while True:
        # Create the list of neighboring solutions
        neighbors = []
        for j in range(len(domain)):
            # Move one step away from the current value in each direction
            if sol[j] > domain[j][0]:
                # print(sol[j],domain[j][0])
                neighbors.append(sol