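I came across a striking idea from Uncle Bob and couldn't resist sharing it: every nested loop can be rewritten as a single loop.
1. Introducing the problem
The conventional code for prime factorization: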
public List<Integer> factorsOf(int n) {
    ArrayList<Integer> factors = new ArrayList<>();
    for (int d = 2; n > 1; d++)
        for (; n % d == 0; n /= d)
            factors.add(d);
    return factors;
}
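That is a doubly nested loop.
But Uncle Bob came across a different, recursive form in the Clojure community (if you'd rather not read the Clojure, skip to the Java translation below):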
(defn prime-factors [n]
  (loop [n n d 2 factors []]
    (if (> n 1)
      (if (zero? (mod n d))
        (recur (/ n d) d (conj factors d))
        (recur n (inc d) factors))
      factors)))
The Java translation:
private List<Integer> factorsOf(int n) {
    return factorsOf(n, 2, new ArrayList<Integer>());
}
private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
    if (n > 1) {
        if (n % d == 0) {
            factors.add(d);
            return factorsOf(n / d, d, factors);
        } else {
            return factorsOf(n, d + 1, factors);
        }
    }
    return factors;
}
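This may not look like much yet. But it is tail recursion, and tail recursion can be converted mechanically to and from a loop: each recursive call becomes an update of the parameters followed by another pass through the loop. So replace the tail recursion with a loop: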
private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
    while (true) {
        if (n > 1) {
            if (n % d == 0) {
                factors.add(d);
                n /= d;
            } else {
                d++;
            }
        } else
            return factors;
    }
}
Huh? One level of looping has suddenly disappeared.
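To convince yourself that the nested version and the single-loop version really compute the same thing, a small self-contained check along these lines may help. The class name FactorsCheck and the method names nested and flat are made up for this sketch, and flat folds the while(true) and the n > 1 test into one while condition for brevity.

```java
import java.util.ArrayList;
import java.util.List;

public class FactorsCheck {
    // Nested-loop version: the outer loop advances d, the inner loop divides d out.
    static List<Integer> nested(int n) {
        List<Integer> factors = new ArrayList<>();
        for (int d = 2; n > 1; d++)
            for (; n % d == 0; n /= d)
                factors.add(d);
        return factors;
    }

    // Single-loop version: one loop, with an if choosing between
    // "divide d out" and "move on to the next d".
    static List<Integer> flat(int n) {
        List<Integer> factors = new ArrayList<>();
        int d = 2;
        while (n > 1) {
            if (n % d == 0) {
                factors.add(d);
                n /= d;
            } else {
                d++;
            }
        }
        return factors;
    }

    public static void main(String[] args) {
        // Compare the two versions over a small range of inputs.
        for (int n = 1; n <= 1000; n++) {
            if (!nested(n).equals(flat(n)))
                throw new AssertionError("mismatch at n = " + n);
        }
        System.out.println(flat(360)); // prints [2, 2, 2, 3, 3, 5]
    }
}
```

The comparison only covers 1 to 1000, so it is a sanity check rather than a proof of equivalence.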
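2. What is really going on
Start with the inner loop of the nested version. Its control condition is whether n % d == 0 holds, so once the outer loop tests that condition itself, the inner loop is no longer needed.
Following that idea, we can pull all the control conditions (inner and outer) out into extra local variables, and the program becomes: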
private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
    while (true) {
        boolean factorsRemain = n > 1;
        boolean currentDivisorIsFactor = n % d == 0;
        if (factorsRemain) {
            if (currentDivisorIsFactor) {
                factors.add(d);
                n /= d;
            } else {
                d++;
            }
        } else
            return factors;
    }
}
The control conditions of the original double loop have turned into while(true) plus a nested if.
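The same move seems to work on other nested loops too. As a rough illustration (not from Uncle Bob's post), here is a sketch that flattens a plain rows-by-columns double loop in the same way. The class FlattenPairs and the names rowsRemain and colsRemain are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenPairs {
    // Nested version: visits every (i, j) with 0 <= i < rows and 0 <= j < cols.
    static List<String> nested(int rows, int cols) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                out.add(i + "," + j);
        return out;
    }

    // Single-loop version: both loop conditions become named booleans,
    // and a nested if decides what one pass of the loop does.
    static List<String> flat(int rows, int cols) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (true) {
            boolean rowsRemain = i < rows;
            boolean colsRemain = j < cols;
            if (rowsRemain) {
                if (colsRemain) {
                    out.add(i + "," + j); // body of the former inner loop
                    j++;
                } else {
                    i++;                  // step of the former outer loop
                    j = 0;                // re-initialize the former inner loop
                }
            } else
                return out;
        }
    }
}
```

The one extra piece of bookkeeping is resetting j when moving to the next row. The factorization example does not need it because its inner loop has no initializer.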
If the nested if feels awkward, the two booleans can be combined so that all the ifs sit at a single level:
private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
    while (true) {
        boolean factorsRemain = n > 1;
        boolean currentDivisorIsFactor = n % d == 0;
        if (factorsRemain && currentDivisorIsFactor) {
            factors.add(d);
            n /= d;
        }
        if (factorsRemain && !currentDivisorIsFactor)
            d++;
        if (!factorsRemain)
            return factors;
    }
}
Flattened like this, the combined conditions are a little harder to read, so give each one a name:
private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
    while (true) {
        boolean factorsRemain = n > 1;
        boolean currentDivisorIsFactor = n % d == 0;
        boolean factorOutCurrentDivisor = factorsRemain && currentDivisorIsFactor;
        boolean tryNextDivisor = factorsRemain && !currentDivisorIsFactor;
        boolean allDone = !factorsRemain;
        if (factorOutCurrentDivisor) {
            factors.add(d);
            n /= d;
        }
        if (tryNextDivisor) {
            d++;
        }
        if (allDone)
            return factors;
    }
}
To set up the next step, turn the control conditions into enum values:
private enum State {Starting, Factoring, Searching, Done}
private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
    State state = State.Starting;
    while (true) {
        boolean factorsRemain = n > 1;
        boolean currentDivisorIsFactor = n % d == 0;
        if (factorsRemain && currentDivisorIsFactor)
            state = State.Factoring;
        if (factorsRemain && !currentDivisorIsFactor)
            state = State.Searching;
        if (!factorsRemain)
            state = State.Done;
        switch (state) {
            case Factoring:
                factors.add(d);
                n /= d;
                break;
            case Searching:
                d++;
                break;
            case Done:
                return factors;
        }
    }
}
At this point, notice that each state can work out the state for the next iteration on its own:
private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
    State state = State.Starting;
    while (true) {
        switch (state) {
            case Starting:
                // n <= 1 rather than n == 1, so that n < 1 still terminates
                // immediately, as with the n > 1 guard in the earlier versions.
                if (n <= 1)
                    state = State.Done;
                else if (n % d == 0)
                    state = State.Factoring;
                else
                    state = State.Searching;
                break;
            case Factoring:
                factors.add(d);
                n /= d;
                if (n <= 1)
                    state = State.Done;
                else if (n % d != 0)
                    state = State.Searching;
                break;
            case Searching:
                d++;
                if (n <= 1)
                    state = State.Done;
                else if (n % d == 0)
                    state = State.Factoring;
                break;
            case Done:
                return factors;
        }
    }
}
What is this good for? Let me first simplify it:
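private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
    State state = State.Starting;
    while (true) {
        // The action depends only on the current state.
        switch (state) {
            case Starting:
                break;
            case Factoring:
                factors.add(d);
                n /= d;
                break;
            case Searching:
                d++;
                break;
            case Done:
                return factors;
        }
        // The transition is shared by every state. n <= 1 rather than n == 1,
        // so that n < 1 still terminates, as with the earlier n > 1 guard.
        if (n <= 1)
            state = State.Done;
        else if (n % d == 0)
            state = State.Factoring;
        else
            state = State.Searching;
    }
}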
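Much clearer. So here is the conclusion:
The original doubly nested loop can in fact be expressed as a Moore finite state machine, and each iteration of the loop is simply a transition from one state to another.
This kind of transition between the states of a finite state machine is also what Alan Turing envisioned back in his 1936 paper. Charles Petzold's book The Annotated Turing covers it in detail.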
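3. Why I found it so striking
It struck me as ingenious because, after many years of writing code, I had never once connected loops with state machines. Uncle Bob gave his own view on why: a Java for loop can hold and mutate the state that its conditions test. That is why it took the Clojure version, which favors immutable variables, for him to see the connection right away. Seen this way, functional programming sits closer to the Moore finite state machine.
Another idea from Uncle Bob: since the outermost loop is while(true), perhaps this could be the starting point for a new language that drops the outer loop as well and simply sets a state machine running. There would be no for, while, if, else, or goto, only switch. The program would move between the states of the finite state machine (FSM) according to the switch until it is told to stop.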
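Java cannot actually drop while and if, but the flavor of that idea can be approximated by hiding the loop in a tiny driver and writing the program as two pieces: a per-state action and a transition function. The following is only a sketch under that assumption. FsmSketch, Registers, act, next, and run are names invented here, not anything from Uncle Bob's post.

```java
import java.util.ArrayList;
import java.util.List;

public class FsmSketch {
    enum State { STARTING, FACTORING, SEARCHING, DONE }

    // The mutable data the machine works on.
    static final class Registers {
        int n;
        int d = 2;
        final List<Integer> factors = new ArrayList<>();
        Registers(int n) { this.n = n; }
    }

    // Action: chosen by the current state alone (the Moore-style part).
    static void act(State s, Registers r) {
        switch (s) {
            case FACTORING:
                r.factors.add(r.d);
                r.n /= r.d;
                break;
            case SEARCHING:
                r.d++;
                break;
            default: // STARTING and DONE do nothing
                break;
        }
    }

    // Transition: picks the next state from the registers.
    // Uses n <= 1 so that inputs below 1 stop immediately.
    static State next(Registers r) {
        if (r.n <= 1) return State.DONE;
        return r.n % r.d == 0 ? State.FACTORING : State.SEARCHING;
    }

    // Driver: the only loop in the program; it runs until DONE is reached.
    static List<Integer> run(int n) {
        Registers r = new Registers(n);
        State s = State.STARTING;
        while (s != State.DONE) {
            act(s, r);
            s = next(r);
        }
        return r.factors;
    }

    public static void main(String[] args) {
        System.out.println(run(360)); // prints [2, 2, 2, 3, 3, 5]
    }
}
```

In the imagined language the driver would be part of the runtime, leaving only the contents of act and next for the programmer to write.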