💥1 Overview
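In genome sequence assembly, the most basic problem is how to choose which upstream and downstream short reads to join into a longer sequence. When extension starts from a single seed read, repeat regions produce a large number of candidate extensions, which makes the assembly problem very hard. The usual approach is to assemble reads according to the overlap information between short (paired-end) reads. This approach struggles when the genome is highly repetitive, and more so when the read data contain errors, repeated sequences, and uneven sequencing depth that leaves some regions covered by only a few reads and others by very many. For these reasons, current assemblers do not produce a perfect assembly.

This post builds a hybrid heuristic that combines the firefly algorithm (FA) with the differential search algorithm (DSA) to solve the gene sequence assembly problem.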
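To make the objective concrete, a candidate solution can be viewed as an ordering of fragments, scored by adding up the overlap between each pair of neighbouring fragments. The sketch below works under that assumption; the matrix `S`, its values, and the name `scoreOrdering` are made up for illustration and are not taken from the original code.

```matlab
% Hypothetical 4-fragment overlap matrix (values are invented):
% S(a,b) is the overlap score when fragment b directly follows fragment a.
S = [0 5 1 0;
     5 0 4 2;
     1 4 0 6;
     0 2 6 0];

% Score of an ordering = sum of the overlaps between adjacent fragments.
scoreOrdering = @(S, perm) sum(S(sub2ind(size(S), perm(1:end-1), perm(2:end))));

goodOrder = [1 2 3 4];   % overlaps 5 + 4 + 6
badOrder  = [1 4 2 3];   % overlaps 0 + 2 + 4

fprintf('good = %d, bad = %d\n', ...
    scoreOrdering(S, goodOrder), scoreOrdering(S, badOrder));
```

With a score of this kind, a metaheuristic such as FA or DSA searches over orderings for the one with the highest total overlap.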
📚2 Results
Partial code:
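%% Parameters
Gamma=50;       % light absorption (0-100); used below as the neighbourhood distance threshold
p1=0.3*rand;    % DSA control parameter
p2=0.3*rand;    % DSA control parameter
MaxGen=10;      % generations per run
PopSize=10;     % population size
CompRun=10;     % number of independent runs
%% Output files
fireDiffDetail=fopen('fireDiffDetail.txt','w');     % per-individual log
fireDiffSim=fopen('fireDiffSim.txt','w');           % best score per generation
fireDiffResults=fopen('fireDiffResults.txt','w');   % best score per run
%% Input data: square score matrix, one row per fragment
ScoringMatrix=csvread('x60189_4.csv');
Dimension=size(ScoringMatrix,1);
tic;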
for runIdx=1:CompRun   % separate counter so the inner population loop can use i
    fprintf('\nNo of Run %d\n',runIdx);
    fprintf(fireDiffDetail,'\n====================');
    fprintf(fireDiffDetail,'\nNo of Run %d\n',runIdx);
    fprintf(fireDiffDetail,'====================\n\n');
    fprintf(fireDiffSim,'\n====================');
    fprintf(fireDiffSim,'\nNo of Run %d\n',runIdx);
    fprintf(fireDiffSim,'====================\n\n');
    %% Initial population
    Chrom=InitPop(PopSize,Dimension);
    ChromList=Chrom;
    %% Initial solution with scoring value
    %disp('An initial population of random solution: ');
    InitialSolution=OutputSolution(Chrom(1,:));
    InitialScoringValue=CalculateScore(ScoringMatrix,Chrom(1,:));
    %% Optimization
    ObjV=CalculateScore(ScoringMatrix,Chrom);
    [preObjV,ObjVNo]=max(ObjV);   % best score found so far in this run
    fprintf(fireDiffSim,' Best Solution\n\n');
    for Gen=1:MaxGen
        % fprintf('\nIteration %d\n',Gen);
        fprintf(fireDiffDetail,' Gen preObjV CurrentBestSolution TopCurrentBestSolution');
        %% Calculate fitness
        % ObjV=CalculateScore(ScoringMatrix,Chrom)
        % [preObjV,ObjVNo]=max(ObjV)
        for i=1:PopSize
            % Collect every other individual within distance Gamma of individual i
            K=0;
            solution=zeros(PopSize,Dimension);
            for j=[1:i-1,i+1:PopSize]
                Dij=Distance(Chrom(i,:),Chrom(j,:));
                if Dij<=Gamma
                    K=K+1;
                    solution(K,:)=Chrom(j,:);
                end
            end
            solution1=solution(1:K,:);   % the K neighbours found (empty when K==0)
            if K~=0
                % fprintf('\nPopulation %d\n',i);
                fprintf(fireDiffDetail,'\nPopulation %d\n',i);
                % Best-scoring neighbour of individual i
                ObjV1=CalculateScore(ScoringMatrix,solution1);
                [maxObjV1,ObjV1No]=max(ObjV1);
                maxChrom=solution1(ObjV1No,:);
                % ObjValue=CalculateScore(ScoringMatrix,Chrom(i,:))
                if preObjV<maxObjV1
                    % FA step: a neighbour beats the best score so far
                    % fprintf('\nFA operation %d\n');
                    fprintf(fireDiffDetail,'Firefly Algorithm');
                    [Chrom(i,:),maxChrom]=ChangePlace(Chrom(i,:),maxChrom);
                    CurrentBestSolution=maxObjV1;
                    preObjV=CurrentBestSolution;
                else
                    % DSA step: build stopover sites from a shuffled copy of the population
                    Donor=Chrom(randperm(PopSize),:);
                    DonorScore=CalculateScore(ScoringMatrix,Donor);
                    % map(m,d)=1 marks the positions of individual m that are perturbed
                    map=zeros(PopSize,Dimension);
                    if rand<rand
                        if rand<p1
                            for m=1:PopSize
                                map(m,:)=rand(1,Dimension)<rand;
                            end
                        else
                            for m=1:PopSize
                                map(m,randi(Dimension))=1;
                            end
                        end
                    else
                        for m=1:PopSize
                            map(m,randi(Dimension,1,ceil(p2*Dimension)))=1;
                        end
                    end
                    Scale=4*randg;   % gamma-distributed step size (randg is part of the Statistics and Machine Learning Toolbox)
                    StopOver=Chrom+(Scale.*map).*(Donor-Chrom);   % move the marked positions toward the donor
                    StopOver=UpdateStopOver(StopOver,Dimension);
                    StopOverObjV=CalculateScore(ScoringMatrix,StopOver);
                    [MaxStopOverObjV,StopOverObjVNo]=max(StopOverObjV);
                    if MaxStopOverObjV>preObjV
                        fprintf(fireDiffDetail,'DS Algorithm ');
                        % fprintf('\nDSA operation %d\n');
                        CurrentBestSolution=MaxStopOverObjV;
                        preObjV=CurrentBestSolution;
                    else
                        fprintf(fireDiffDetail,'No Change ');
                        % fprintf('\nNo Change %d\n');
                        CurrentBestSolution=preObjV;
                    end
                end
                ShowNewObjValue(1,i)=CurrentBestSolution;
                TopCurrentBestSolution=max(ShowNewObjValue);
                fprintf(fireDiffDetail,'%5d ---> %5d ---> %5d ---> %5d\n',Gen,preObjV,CurrentBestSolution,TopCurrentBestSolution);
            end
        end
        fprintf(fireDiffDetail,'--------------------------------------------------------------------------------------------------\n');
        BestSolution=max(ShowNewObjValue);
        globalmax(Gen)=BestSolution;
        iteration(Gen)=Gen;
        fprintf(fireDiffSim,'Iteration %5d ---> %5d\n',Gen,BestSolution);
        clear ShowNewObjValue;
    end
    % Convergence curve for this run (all runs are drawn in the same figure)
    plot(iteration,globalmax)
    xlabel('Iteration');
    ylabel('Best Score');
    hold on;
    %ShowChromList=Chrom
    %ScoreChromList=CalculateScore(ScoringMatrix,ShowChromList)
    %OptimalSolution=OutputSolution(Chrom())
    TopBestSolution=max(globalmax);   % best score over all generations of this run
    fprintf(fireDiffSim,'\nTopBestSolution ---> %5d\n',TopBestSolution);
    fprintf(fireDiffResults,'%5d\n',TopBestSolution);
end
fclose(fireDiffDetail);
fclose(fireDiffSim);
fclose(fireDiffResults);
%% Summary statistics over all runs
fireDiffResults=fopen('fireDiffResults.txt','r');
data=cell2mat(textscan(fireDiffResults,'%f'));   % read as double so mean and std work
highestScore=max(data);
lowestScore=min(data);
avg=mean(data);
stdDev=std(data);
disp(['Best Scoring Value = ' num2str(highestScore)]);
disp(['Worst Scoring Value = ' num2str(lowestScore)]);
disp(['Average = ' num2str(avg)]);
disp(['Standard Deviation = ' num2str(stdDev)]);
fclose(fireDiffResults);
toc;
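The listing calls several helper functions (`InitPop`, `CalculateScore`, `Distance`, `ChangePlace`, `UpdateStopOver`, `OutputSolution`) whose bodies are not shown. As a rough guide to what the population and the DSA step might look like, here is a minimal sketch under explicit assumptions: each individual is a permutation of fragment indices, the distance between two individuals is the number of positions where they differ, and a real-valued stopover row is turned back into a permutation by ranking its entries. The names `initPopSketch` and `distanceSketch` and the ranking repair are hypothetical stand-ins, not the original helpers.

```matlab
% Hypothetical stand-ins for helpers the listing does not show.
% Assumption: an individual is a permutation of 1..dim (a fragment order).
initPopSketch = @(popSize, dim) cell2mat(arrayfun(@(k) randperm(dim), ...
    (1:popSize)', 'UniformOutput', false));

% Assumption: distance = number of positions where two orderings differ.
distanceSketch = @(a, b) sum(a ~= b);

pop = initPopSketch(6, 5);                         % 6 individuals, 5 fragments
d   = distanceSketch([1 2 3 4 5], [1 3 2 4 5]);    % differs at positions 2 and 3

% DSA-style move for one individual: x + scale .* map .* (donor - x).
x     = [1 2 3 4 5];
donor = [5 4 3 2 1];
map   = [1 0 0 0 1];     % only positions 1 and 5 are perturbed
scale = 0.5;             % fixed for the example; the listing draws 4*randg
stopover = x + scale .* map .* (donor - x);        % [3 2 3 4 3], no longer a permutation

% Assumption: repair by ranking, so the result is a permutation again.
[~, order] = sort(stopover);       % sort is stable, so ties keep their original order
repaired = zeros(1, numel(order));
repaired(order) = 1:numel(order);  % rank of each entry becomes its fragment index
disp(repaired);
```

Ranking is only one possible repair. It keeps the relative order suggested by the real-valued move, which is why it is a common choice when a continuous optimizer is applied to a permutation problem.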