含分式的非线性规划求解: 基于Outer Approximation的Branch-and-cut 算法及其Java实现
问题介绍
在混合整数规划求解中,如果目标函数和约束都为线性的情况下,求解器可以直接求解得到精确解。虽然目前求解器也可以完成一些特殊非线性目标函数和约束的求解,如MIQP, MIQCP, MIQCQP, MISOCP等(详情参见运小筹第60期推文),但针对一般的非线性问题,如目标函数为以下的分式形式,就无法利用求解器直接求解。
m
i
n
∑
i
∈
I
U
1
U
2
+
∑
j
∈
J
x
j
(1)
min \sum_{i \in I} \frac{U_1 }{U_2+\sum_{j\in J}x_j}\tag{1}
mini∈I∑U2+∑j∈JxjU1(1)
针对目标函数为凸函数的情况,我们采取Outer-Approximation(下称OA)的方法进行求解,OA由Duran和Grossmann在1986年提出
[
1
]
^{[1]}
[1],后面被不断发展改进。OA的基本思想就是利用松弛问题的解生成切线,再加入松弛主问题中,直至找到最优解,利用多条切线去刻画原来的凸函数曲线。近年来的研究中,学者们通常将OA与Branch-and-cut结合使用,利用OA生成cut加入算法中。
Outer Approximation介绍
下面将借助一个具体模型介绍OA的求解过程。
m
i
n
∑
i
∈
I
U
1
U
2
+
∑
j
∈
J
ω
i
j
x
j
(2)
min \sum_{i \in I} \frac{U_1 }{U_2+\sum_{j\in J}\omega_{ij}x_j}\tag{2}
mini∈I∑U2+∑j∈JωijxjU1(2)
∑
j
∈
J
x
j
≤
r
(3)
\sum_{j\in J} x_j \le r \tag{3}
j∈J∑xj≤r(3)
x
j
∈
{
0
,
1
}
,
∀
j
∈
J
(4)
x_j \in \{0,1\}, \forall j\in J\tag{4}
xj∈{0,1},∀j∈J(4)
针对原问题,我们进行一些变换,引入连续变量
θ
\theta
θ,将复杂的目标函数放在约束条件中
m
i
n
θ
(5)
min \theta\tag{5}
minθ(5)
θ
≥
∑
i
∈
I
U
1
U
2
+
∑
j
∈
J
ω
i
j
x
j
(6)
\theta \ge \sum_{i \in I} \frac{U_1 }{U_2+\sum_{j\in J}\omega_{ij}x_j} \tag{6}
θ≥i∈I∑U2+∑j∈JωijxjU1(6)
∑
j
∈
J
x
j
≤
r
(7)
\sum_{j\in J} x_j \le r \tag{7}
j∈J∑xj≤r(7)
x
j
∈
{
0
,
1
}
,
∀
j
∈
J
(8)
x_j \in \{0,1\}, \forall j\in J \tag{8}
xj∈{0,1},∀j∈J(8)
对约束(6),它在
x
x
x上是凸函数,所以我们在
x
∗
x^*
x∗处进行一阶展开得到
θ
≥
F
(
x
∗
)
+
∑
j
∈
J
F
j
′
(
x
∗
)
(
x
j
−
x
j
∗
)
(9)
\theta \ge F(x^*)+\sum_{j \in J} F^{'}_j(x^*)(x_j-x_j^*) \tag{9}
θ≥F(x∗)+j∈J∑Fj′(x∗)(xj−xj∗)(9)
其中
F
(
x
)
=
∑
i
∈
I
U
1
U
2
+
∑
j
∈
J
ω
i
j
x
j
F(x)=\sum_{i \in I} \frac{U_1 }{U_2+\sum_{j\in J}\omega_{ij}x_j}
F(x)=∑i∈IU2+∑j∈JωijxjU1
F
j
′
(
x
)
=
d
F
(
x
)
d
x
j
=
−
∑
i
∈
I
U
1
ω
i
j
(
U
2
+
∑
k
∈
J
w
i
k
x
k
)
2
(10)
F^{'}_j(x)=\frac{dF(x)}{dx_j }=-\sum_{i \in I}\frac{U_1\omega_{ij}}{(U_2 +\sum_{k\in J}w_{ik}x_k)^2} \tag{10}
Fj′(x)=dxjdF(x)=−i∈I∑(U2+∑k∈Jwikxk)2U1ωij(10)
所以约束(6)变换为
θ
≥
∑
i
∈
I
U
1
U
2
+
∑
j
∈
J
ω
i
j
x
j
∗
−
∑
j
∈
J
∑
i
∈
I
U
1
ω
i
j
(
U
2
+
∑
k
∈
J
w
i
k
x
k
∗
)
2
(
x
j
−
x
j
∗
)
(11)
\theta \ge \sum_{i \in I} \frac{U_1 }{U_2+\sum_{j\in J}\omega_{ij}x_j^*}-\sum_{j\in J}\sum_{i \in I}\frac{U_1\omega_{ij}}{(U_2 +\sum_{k\in J}w_{ik}x_k^*)^2}(x_j-x_j^*) \tag{11}
θ≥i∈I∑U2+∑j∈Jωijxj∗U1−j∈J∑i∈I∑(U2+∑k∈Jwikxk∗)2U1ωij(xj−xj∗)(11)
式(11)就被称为OA cut
java调用cplex代码
Branch-and-cut实现: 为了求解上述模型,我们借助cplex求解器中的callback函数,当当前LP松弛的解决方案
x
∗
x^*
x∗被证明是整数时,我们检查是否违反了约束(11),如果违背,则将它们添加到当前LP中,OA cut是全局有效的,通过在MILP求解器的lazy-cut callback实现的。下面是Java调用cplex实现OA cut的Branch-and-cut算法代码。
首先是Cut类,用于存储OA生成的cuts
import java.util.ArrayList;
import ilog.concert.IloNumExpr;
public class Cuts {
ArrayList<IloNumExpr> lhs;
ArrayList<Double> rhs;
/**
* @return the lhs
*/
public ArrayList<IloNumExpr> getLhs() {
return lhs;
}
/**
* @param lhs the lhs to set
*/
public void setLhs(ArrayList<IloNumExpr> lhs) {
this.lhs = lhs;
}
/**
* @return the rhs
*/
public ArrayList<Double> getRhs() {
return rhs;
}
/**
* @param rhs the rhs to set
*/
public void setRhs(ArrayList<Double> rhs) {
this.rhs = rhs;
}
}
下面是主要的代码
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import Cuts;
import ilog.concert.*;
import ilog.cplex.IloCplex;
import ilog.cplex.IloCplex.LazyConstraintCallback;
public class OA {
//calculate the gradient
public static double CalculateGradient(double[] x,double[][] w,int j,int num,double U1,double U2) throws IloException {
double gradient_j=0;
for(int i=0;i<num;i++) {
double b=0;
for(int k=0;k<x.length;k++) {
b+=w[i][k]*x[k];
}
gradient_j+=-U1*w[i][j]/(b+U2)/(b+U2);
}
return gradient_j;
}
//calculate the value of the objective function
public static double calF(double[] x,double[][] w,int num,double U1,double U2) {
double F=0;
for(int i=0;i<num;i++) {
double b=0;
for(int k=0;k<x.length;k++) {
b+=w[i][k]*x[k];
}
F+= U1/(U2+b);
}
return F;
}
//OA cut
public static Cuts makecuts(IloNumVar[] x, double[] xSol,double[][] w,int num,double U1,double U2, IloModeler ilcplex,IloNumVar theta,double theta0) throws IloException {
Cuts cuts=new Cuts();
ArrayList<IloNumExpr> cutLhs = new ArrayList<IloNumExpr>();
ArrayList<Double> cutRhs = new ArrayList<Double>();
IloLinearNumExpr lhsExpr=ilcplex.linearNumExpr();
double rhsExpr=0;
double lhs = 0, rhs = 0;
lhsExpr.addTerm(1, theta);
rhsExpr+=calF(xSol,w,num,U1,U2);
for(int j=0;j<xSol.length;j++) {
lhsExpr.addTerm(-CalculateGradient(xSol,w,j,num,U1,U2), x[j]);
rhsExpr-=CalculateGradient(xSol,w,j,num,U1,U2)*xSol[j];
}
lhs=theta0;
rhs=calF(xSol,w,num,U1,U2);
if(lhs<rhs) {
cutLhs.add(lhsExpr);
cutRhs.add(rhsExpr);
}
cuts.setLhs(cutLhs);
cuts.setRhs(cutRhs);
return cuts;
}
public static void buildModel(double[][] w,int num1,int num2,double U1,double U2) throws IloException {
IloCplex ilcplex =new IloCplex();
IloIntVar[] x=new IloIntVar[num1];
int r=3;
for(int i=0;i<num1;i++) {
x[i]=ilcplex.boolVar();
}
//objective
IloNumVar theta=ilcplex.numVar(0, Double.MAX_VALUE);
ilcplex.addMinimize(theta);
//constraints
IloLinearNumExpr left=ilcplex.linearNumExpr();
for(int i=0;i<num1;i++) {
left.addTerm(1, x[i]);
}
ilcplex.addEq(left, r);
ilcplex.use(new LazyCallback(x,ilcplex,theta,w,num2,U1,U2));
if(ilcplex.solve()) {
double[] xVal = ilcplex.getValues(x);
System.out.println("The objective value is:" + ilcplex.getObjValue());
System.out.println(Arrays.toString(xVal));
}
}
//LazyCallback
public static class LazyCallback extends LazyConstraintCallback{
Cuts cut;
ArrayList<IloNumExpr> cutLhs;
ArrayList<Double> cutRhs;
IloIntVar[] x;
IloCplex ilcplex;
IloNumVar theta;
double[][] w;
int num;
double U1,U2;
LazyCallback(IloIntVar[] x0,IloCplex ilcplex0,IloNumVar theta0,double[][]w0,int num0,double U10,double U20){
x=x0;
ilcplex=ilcplex0;
theta=theta0;
w=w0;
num=num0;
U1=U10;
U2=U20;
}
public void main() throws IloException{
double[] xSol =getValues(x);
double theta0=getValue(theta);
cut = makecuts(x, xSol,w,num,U1,U2,ilcplex,theta,theta0);
cutLhs = cut.getLhs();
cutRhs= cut.getRhs();
for(int i = 0; i< cutLhs.size(); i++) {
addLocal(ilcplex.ge(ilcplex.diff(cutLhs.get(i),cutRhs.get(i)), 0));
}
}
}
public static void main(String[] args) throws IOException, IloException {
double U1=5;
double U2=3;
int num1=5;
int num2=10;
double[][] w= {{3,2,5,7,2},{6,2,4,9,3},{3,1,6,5,2},{4,1,3,2,2},{2,8,3,6,2},{7,5,4,9,3},{3,7,5,2,6},{6,4,2,4,6},{4,6,2,8,1},{3,4,8,5,3}};
buildModel(w,num1,num2,U1,U2);
下面为代码的求解结果:
参考文献
[1] Duran M A , Grossmann I E . An outer-approximation algorithm for a class of mixed-integer nonlinear programs[J]. Mathematical Programming, 1986, 36(3):307-339.