The key data structure is the conditional FP-tree: a trie in which every path is a frequency-sorted transaction.

1. We count the frequency of each item and construct the FP-tree. At the same time, we keep a header table that links together all nodes carrying the same item

2. For each item in the header table, we collect its prefix paths and build a conditional FP-tree out of them - this tree is conditioned on that item (or itemset)

3. We recursively apply step 2 to each item's generated conditional FP-tree

Please note that the code in the book has some defects. I fixed them as below:

```
# Tree node of the FP-tree
class TreeNode:
    def __init__(self, nameValue, numOccur, parentNode):
        self.name = nameValue
        self.count = numOccur
        self.nodeLink = None  # link to the next node carrying the same item
        self.parent = parentNode
        self.children = {}

    def inc(self, numOccur):
        self.count += numOccur

    def disp(self, ind=1):  # DFS to print the tree
        print(' ' * ind, self.name, ' ', self.count)
        for child in self.children.values():
            child.disp(ind + 1)

'''
======= FP-Tree Construction (like a Trie) =======
'''
def createTree(dataSet, minSup=1):  # dataSet is {frozenset(transaction): count}
    # Pass 1: count the frequency of each item
    headerTable = {}
    for trans in dataSet:
        for item in trans:
            headerTable[item] = headerTable.get(item, 0) + dataSet[trans]
    # Remove infrequent items
    keysToDel = [k for k in headerTable if headerTable[k] < minSup]
    for k in keysToDel:
        headerTable.pop(k, None)
    freqItemSet = set(headerTable.keys())
    if len(freqItemSet) == 0:
        return None, None
    # Extend each headerTable entry to [frequency, link to first node]
    for k in headerTable:
        headerTable[k] = [headerTable[k], None]
    retTree = TreeNode('Null', 1, None)
    # Pass 2: insert each transaction, items sorted by global frequency
    for tranSet, count in dataSet.items():
        localD = {}
        for item in tranSet:
            if item in freqItemSet:
                localD[item] = headerTable[item][0]  # frequent
        if len(localD) > 0:
            # sort by frequency - highest comes first
            st = sorted(localD.items(), key=lambda p: p[1], reverse=True)
            orderedItems = [v[0] for v in st]
            updateTree(orderedItems, retTree, headerTable, count)
    return retTree, headerTable

def updateTree(items, inTree, headerTable, count):
    # Iterative insertion of one ordered transaction
    for item in items:
        if item in inTree.children:
            inTree.children[item].inc(count)
        else:
            inTree.children[item] = TreeNode(item, count, inTree)
            # Append the new node to the linked list in headerTable
            if headerTable[item][1] is None:
                headerTable[item][1] = inTree.children[item]
            else:
                updateHeader(headerTable[item][1], inTree.children[item])
        inTree = inTree.children[item]

def updateHeader(nodeToTest, targetNode):  # the linked list of same-item nodes
    while nodeToTest.nodeLink is not None:  # go to the end of the linked list
        nodeToTest = nodeToTest.nodeLink
    nodeToTest.nodeLink = targetNode

'''
======= Creating conditional FP-trees =======
'''
def ascendTree(leafNode, prefixPath):  # climb bottom-up to the root
    if leafNode.parent is not None:
        prefixPath.append(leafNode.name)
        ascendTree(leafNode.parent, prefixPath)

def findPrefixPath(treeNode):
    condPats = {}
    while treeNode is not None:  # ascend from every instance of the same item
        prefixPath = []
        ascendTree(treeNode, prefixPath)
        if len(prefixPath) > 1:
            condPats[frozenset(prefixPath[1:])] = treeNode.count
        treeNode = treeNode.nodeLink
    return condPats

'''
======= Mining =======
'''
def mineTree(headerTable, minSup, preFix, freqItemList, level=0):
    # start from the least frequent item
    bigL = [v[0] for v in sorted(headerTable.items(), key=lambda p: p[1][0])]
    # Each iteration mines the conditional tree of some condition like p&q
    for basePat in bigL:
        newFreqSet = preFix.copy()
        newFreqSet.add(basePat)
        freqItemList.append((newFreqSet, headerTable[basePat][0]))  # (itemset, support count)
        condPattBases = findPrefixPath(headerTable[basePat][1])
        myCondTree, myHead = createTree(condPattBases, minSup)
        if myHead is not None:
            mineTree(myHead, minSup, newFreqSet, freqItemList, level + 1)
```
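As a small self-contained illustration of the ordering step in Pass 2 (this sketch is mine, not from the post; the sample transactions resemble the book's example data): each transaction keeps only globally frequent items and is reordered by descending global count before being inserted into the trie. Note the explicit tie-break by item name - sorting by count alone leaves the order of equally frequent items unspecified, a known pitfall of the book's version.

```
from collections import Counter

transactions = [['r', 'z', 'h', 'j', 'p'],
                ['z', 'y', 'x', 'w', 'v', 'u', 't', 's'],
                ['z'],
                ['r', 'x', 'n', 'o', 's'],
                ['y', 'r', 'x', 'z', 'q', 't', 'p'],
                ['y', 'z', 'x', 'e', 'q', 's', 't', 'm']]
minSup = 3
counts = Counter(item for t in transactions for item in t)
frequent = {i for i, c in counts.items() if c >= minSup}

def ordered(trans):
    # keep only frequent items, highest global count first
    # (ties broken by name so the tree shape is deterministic)
    return sorted((i for i in trans if i in frequent),
                  key=lambda i: (-counts[i], i))

print(ordered(['r', 'z', 'h', 'j', 'p']))  # ['z', 'r']
```

Every transaction reordered this way shares prefixes with the others as aggressively as possible, which is what makes the FP-tree compact.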

Author: saintony. Published 2016/03/08 06:53:11. Original link: https://blog.csdn.net/saintony/article/details/50824208

```
# Load Data (a hard-coded sample; `path` is ignored)
def loadDataSet(path):
    return [[1, 3, 4],
            [2, 3, 5],
            [1, 2, 3, 5],
            [2, 5]]

'''
======== Frequent Set Searching ========
'''
# Create candidate sets of size 1
def createC1(dataSet):
    C1 = []
    # TODO: a set instead of a list may be good enough
    for transaction in dataSet:
        for item in transaction:
            if [item] not in C1:
                C1.append([item])
    C1.sort()
    return map(frozenset, C1)

# Prune all sets with support < minSupport
# D - dataset
# Ck - candidate sets
# minSupport - threshold
def scanD(D, Ck, minSupport):
    ssCnt = {}
    for tid in D:
        for can in Ck:
            if can.issubset(tid):
                if can not in ssCnt:
                    ssCnt[can] = 1
                else:
                    ssCnt[can] += 1
    numItems = float(len(D))
    retList = []
    supportData = {}
    # Measure support and prune
    for key in ssCnt:
        support = ssCnt[key] / numItems
        if support >= minSupport:
            retList.insert(0, key)
        supportData[key] = support
    return retList, supportData

def aprioriGen(Lk, k):  # creates Ck from L(k-1)
    retList = []
    lenLk = len(Lk)
    for i in range(lenLk):
        for j in range(i + 1, lenLk):
            # merge two (k-1)-sets whose first k-2 items agree,
            # e.g. [0,1] | [0,2] -> [0,1,2]
            L1 = sorted(Lk[i])[:k-2]
            L2 = sorted(Lk[j])[:k-2]
            if L1 == L2:
                retList.append(Lk[i] | Lk[j])
    return retList

def apriori(dataSet, minSupport=0.5):
    # start from size-1 itemsets
    C1 = list(createC1(dataSet))
    D = list(map(set, dataSet))
    L1, supportData = scanD(D, C1, minSupport)
    L = [L1]
    k = 2
    while len(L[k-2]) > 0:
        print('=Debug= Apriori: size of last level', len(L[k-2]))
        Ck = aprioriGen(L[k-2], k)
        Lk, supK = scanD(D, Ck, minSupport)
        supportData.update(supK)
        L.append(Lk)
        k += 1
    return L, supportData

'''
======== Association Rule Searching ========
H: a list of items that could appear on the right-hand side of a rule
'''
def calcConf(freqSet, H, supportData, brl, minConf=0.7):
    prunedH = []
    for conseq in H:
        conf = supportData[freqSet] / supportData[freqSet - conseq]
        if conf >= minConf:
            print(set(freqSet - conseq), '-->', set(conseq), 'conf:', conf * 100, '%')
            brl.append((freqSet - conseq, conseq, conf))
            prunedH.append(conseq)
    return prunedH

def rulesFromConseq(freqSet, H, supportData, brl, minConf=0.7):
    m = len(H[0])
    if len(freqSet) > (m + 1):
        Hmp1 = aprioriGen(H, m + 1)  # generate consequents for the next iteration
        Hmp1 = calcConf(freqSet, Hmp1, supportData, brl, minConf)  # pruning: keep qualified rules
        if len(Hmp1) > 1:
            rulesFromConseq(freqSet, Hmp1, supportData, brl, minConf)  # continue to the next level

# L: frequent itemsets, grouped by length
def generateRules(L, supportData, minConf=0.7):
    bigRuleList = []
    for i in range(1, len(L)):  # start from length 2
        for freqSet in L[i]:
            H1 = [frozenset([item]) for item in freqSet]  # {0,1,2} -> [{0},{1},{2}]
            # build right-hand sides starting from size 1
            if i > 1:  # length > 2: go level by level
                rulesFromConseq(freqSet, H1, supportData, bigRuleList, minConf)
            else:  # only 2 items: just prune - the base case
                calcConf(freqSet, H1, supportData, bigRuleList, minConf)
    return bigRuleList
```
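As a sanity check on the Apriori code, here is an independent brute-force search for frequent itemsets on the same four sample transactions (this is my own cross-check, not the book's code; it simply enumerates every candidate with itertools):

```
from itertools import combinations

def brute_force_frequent(dataSet, minSupport=0.5):
    # Enumerate every possible itemset and keep those meeting minSupport
    transactions = [set(t) for t in dataSet]
    items = sorted({i for t in transactions for i in t})
    result = {}
    for size in range(1, len(items) + 1):
        for cand in combinations(items, size):
            s = set(cand)
            support = sum(s <= t for t in transactions) / len(transactions)
            if support >= minSupport:
                result[frozenset(cand)] = support
    return result

data = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]
freq = brute_force_frequent(data)
print(sorted(sorted(fs) for fs in freq))
# -> [[1], [1, 3], [2], [2, 3], [2, 3, 5], [2, 5], [3], [3, 5], [5]]
```

This is exponential in the number of distinct items, so it only works for toy data, but any itemset it reports should also be found by `apriori(data)` above.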

Author: saintony. Published 2016/03/02 09:10:30. Original link: https://blog.csdn.net/saintony/article/details/50777278

But G++ has limitations when compiling recursive templates, such as the instantiation depth limit and its eagerness. Please see the comments below for details.

```
/*
Compiles with G++ 4.8.1 using:
g++ -g -c riddle_meta.cpp -std=c++11 -ftemplate-depth=3000
g++ -o riddle_meta.exe riddle_meta.o -pg
*/
#include <iostream>
using namespace std;

#define MID(a, b) (((a) + (b)) / 2)
#define POW(a) ((a) * (a))

// Integer square root via compile-time binary search
template <int v, int l, int r>
class SQRT
{
    static const int mid = MID(r, l);
    static const int mid_pow = POW(mid);
    static const int nl = mid_pow >= v ? l : mid + 1;
    static const int nr = mid_pow >= v ? mid : r;
public:
    static const int value = SQRT<v, nl, nr>::value;
};

template <int v, int l = 1, int r = v>
class SQRT;

template <int v, int r>
class SQRT<v, r, r>
{
public:
    static const int value = r;
};

// Perfect-square check
template <int VAL>
class PSQRT
{
    static const int sqrt = SQRT<VAL>::value;
public:
    static const bool value = (sqrt * sqrt) == VAL;
};

// Primality check: try odd divisors from DIV down to 3
template <int VAL, int DIV>
class PRIME
{
public:
    static const bool value = (VAL % DIV == 0) ? false : PRIME<VAL, (DIV % 2) ? (DIV - 2) : (DIV - 1)>::value;
};

template <int VAL>
class PRIME<VAL, 2>
{
public:
    static const bool value = VAL % 2 == 1;
};

template <int VAL>
class PRIME<VAL, 3>
{
public:
    static const bool value = VAL % 3 != 0;
};

// Goldbach's "other" conjecture check: VAL = P + 2*k^2 for some prime P
template <int VAL, int P>
class Goldbach
{
    static const int next_odd = (P % 2) ? (P - 2) : (P - 1);
public:
    static const bool value = (!PRIME<P, SQRT<P>::value>::value) ?
        Goldbach<VAL, next_odd>::value :   // if P is not prime, try the next odd number
        (PSQRT<(VAL - P) / 2>::value ? true : Goldbach<VAL, next_odd>::value);
};

template <int VAL>
class Goldbach<VAL, 2>
{
public:
    static const bool value = PSQRT<((VAL - 2) / 2)>::value;
};

template <int VAL>
class Goldbach<VAL, 3>
{
public:
    static const bool value = PSQRT<((VAL - 3) / 2)>::value;
};

template <>
class Goldbach<3, 1>
{
public:
    static const bool value = true;
};

// Main loop: check odd numbers one by one, starting from VAL
template <int VAL>
class Driver
{
public:
    // HACK: with the primality-checking expression enabled, G++ runs out of memory
    static const int value = (/*(!PRIME<VAL, VAL - 2>::value) ||*/ Goldbach<VAL, VAL - 2>::value) ? Driver<VAL + 2>::value : VAL;
};

// HACK: G++ does not instantiate templates lazily, so there must be an upper-bound termination
template <>
class Driver<5801>
{
public:
    static const int value = 5801;
};

int main()
{
    // HACK: G++ hits memory limits when compiling deep template recursion
    std::cout << Driver<5651>::value << endl;
    return 0;
}
```
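The same search is easy to cross-check at run time. Below is a quick Python sketch of mine (not part of the original post) of the underlying riddle - finding the smallest odd composite that cannot be written as a prime plus twice a square - which is what the template code above computes between its hard-coded bounds:

```
def first_counterexample(limit=10000):
    # Sieve of Eratosthenes for primality
    sieve = [True] * limit
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    # All values 2*k^2 below the limit
    twice_squares = [2 * k * k for k in range(1, int((limit / 2) ** 0.5) + 1)]
    for n in range(9, limit, 2):          # odd numbers
        if sieve[n]:
            continue                      # primes are not candidates
        if not any(sieve[n - s] for s in twice_squares if s < n):
            return n                      # no decomposition n = prime + 2*k^2

print(first_counterexample())  # 5777
```

The answer, 5777, sits between the template code's start value `Driver<5651>` and its termination bound `Driver<5801>`, which is why those particular constants were chosen.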

Author: saintony. Published 2015/07/19 02:36:59. Original link: https://blog.csdn.net/saintony/article/details/46949627

Not a hard one to code, but it can be optimized using SSE2 instructions. The code below builds and runs with g++ 4.8.1:

g++ -g -c riddle.cpp -std=c++11 -msse2 -pg

g++ -o riddle.exe riddle.o -pg

objdump -d -M intel -S riddle.o > assembly.txt

riddle

gprof riddle.exe gmon.out > report.txt

And here is the code:

```
#if defined(__SSE2__)
#include <xmmintrin.h> // SSE
#include <emmintrin.h> // SSE2
#endif
#include <ctime>
#include <cstdio>
#include <iostream>
#include <chrono>
using namespace std;

typedef unsigned long UL;

#define MAX_PRIME_CNT 1000
#define MAX_CNT 10001

extern int primes[MAX_PRIME_CNT];
bool SquareM[MAX_CNT] = {false};
int startPrimeInx[MAX_CNT];

#if defined(__SSE2__)
// Debug only
void printM128I(const __m128i &v)
{
    unsigned* p = (unsigned*)&v;
    cout << p[0] << ":" << p[1] << ":" << p[2] << ":" << p[3] << endl;
}

// Calculate a*a and b*b in one instruction (results land in 64-bit lanes 0 and 1)
int sse_v[4] = {0};
inline __m128i pwr2_sse(const int &a, const int &b)
{
    sse_v[0] = a; sse_v[2] = b;
    __m128i mv = _mm_loadu_si128((__m128i *)sse_v);
    return _mm_mul_epu32(mv, mv);
}

// Calculate (a[i] - b[i]) >> 1, for i in [0..3]
inline __m128i sub4_and_shr1_sse(int a[4], int *b)
{
    __m128i va = _mm_loadu_si128((__m128i *)a);
    __m128i vp = _mm_loadu_si128((__m128i *)b);
    return _mm_srli_epi32(_mm_sub_epi32(va, vp), 1);
}
#endif

int main()
{
    auto start = std::chrono::high_resolution_clock::now();
    // Mark perfect square numbers
    for (int i = 0; i < 100; i += 4)
    {
#if defined(__SSE2__)
        __m128i r = pwr2_sse(i, i + 1);
        unsigned* val = (unsigned*) &r;
        SquareM[val[0]] = SquareM[val[2]] = true;
        r = pwr2_sse(i + 2, i + 3);
        val = (unsigned*) &r;
        SquareM[val[0]] = SquareM[val[2]] = true;
#else
        SquareM[ i      *  i     ] =
        SquareM[(i + 1) * (i + 1)] =
        SquareM[(i + 2) * (i + 2)] =
        SquareM[(i + 3) * (i + 3)] = true;
#endif
    }
    // Pre-calculate, for each odd number, the index of the first smaller prime
    register UL prevPrime, currPrime;
    for (int i = 1; i < MAX_PRIME_CNT; i++)
    {
        prevPrime = primes[i - 1];
        currPrime = primes[i];
        startPrimeInx[prevPrime] = -2;
        for (int j = prevPrime + 2; j < currPrime; j += 2) // skip all evens
            startPrimeInx[j] = i - 1;
    }
    // Main logic
    register UL v = 1;
    register int offset;
    while (v += 2)
    {
        offset = startPrimeInx[v]; // pre-calculated above
        while (offset >= 0)
        {
#if defined(__SSE2__)
            // If we still have more than 4 primes to check,
            // use SSE2 instructions to check 4 primes at once
            if (offset > 4)
            {
                int vv[4] = {(int)v, (int)v, (int)v, (int)v};
                __m128i r = sub4_and_shr1_sse(vv, primes + offset - 3);
                unsigned * pinx = (unsigned *)&r;
                if (SquareM[pinx[3]] || SquareM[pinx[2]] || SquareM[pinx[1]] || SquareM[pinx[0]])
                    break;
                offset -= 4;
            }
            else
            {
                if (SquareM[(v - primes[offset]) >> 1]) break;
                offset--;
            }
#else
            if (SquareM[(v - primes[offset]) >> 1]) break;
            offset--;
#endif
        }
        if (offset == -1) break;
    }
    printf("%lu\n", v);
    // Output time spent in seconds
    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> diff = end - start;
    cout << "Time (in seconds): " << diff.count() << endl;
    return 0;
}
// Pre-loaded Primes
// Memory-Performance exchange
//
int primes[MAX_PRIME_CNT] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29
, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71
, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113
, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173
, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229
, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281
, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349
, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409
, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463
, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541
, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601
, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659
, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733
, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809
, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863
, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941
, 947, 953, 967, 971, 977, 983, 991, 997, 1009, 1013
, 1019, 1021, 1031, 1033, 1039, 1049, 1051, 1061, 1063, 1069
, 1087, 1091, 1093, 1097, 1103, 1109, 1117, 1123, 1129, 1151
, 1153, 1163, 1171, 1181, 1187, 1193, 1201, 1213, 1217, 1223
, 1229, 1231, 1237, 1249, 1259, 1277, 1279, 1283, 1289, 1291
, 1297, 1301, 1303, 1307, 1319, 1321, 1327, 1361, 1367, 1373
, 1381, 1399, 1409, 1423, 1427, 1429, 1433, 1439, 1447, 1451
, 1453, 1459, 1471, 1481, 1483, 1487, 1489, 1493, 1499, 1511
, 1523, 1531, 1543, 1549, 1553, 1559, 1567, 1571, 1579, 1583
, 1597, 1601, 1607, 1609, 1613, 1619, 1621, 1627, 1637, 1657
, 1663, 1667, 1669, 1693, 1697, 1699, 1709, 1721, 1723, 1733
, 1741, 1747, 1753, 1759, 1777, 1783, 1787, 1789, 1801, 1811
, 1823, 1831, 1847, 1861, 1867, 1871, 1873, 1877, 1879, 1889
, 1901, 1907, 1913, 1931, 1933, 1949, 1951, 1973, 1979, 1987
, 1993, 1997, 1999, 2003, 2011, 2017, 2027, 2029, 2039, 2053
, 2063, 2069, 2081, 2083, 2087, 2089, 2099, 2111, 2113, 2129
, 2131, 2137, 2141, 2143, 2153, 2161, 2179, 2203, 2207, 2213
, 2221, 2237, 2239, 2243, 2251, 2267, 2269, 2273, 2281, 2287
, 2293, 2297, 2309, 2311, 2333, 2339, 2341, 2347, 2351, 2357
, 2371, 2377, 2381, 2383, 2389, 2393, 2399, 2411, 2417, 2423
, 2437, 2441, 2447, 2459, 2467, 2473, 2477, 2503, 2521, 2531
, 2539, 2543, 2549, 2551, 2557, 2579, 2591, 2593, 2609, 2617
, 2621, 2633, 2647, 2657, 2659, 2663, 2671, 2677, 2683, 2687
, 2689, 2693, 2699, 2707, 2711, 2713, 2719, 2729, 2731, 2741
, 2749, 2753, 2767, 2777, 2789, 2791, 2797, 2801, 2803, 2819
, 2833, 2837, 2843, 2851, 2857, 2861, 2879, 2887, 2897, 2903
, 2909, 2917, 2927, 2939, 2953, 2957, 2963, 2969, 2971, 2999
, 3001, 3011, 3019, 3023, 3037, 3041, 3049, 3061, 3067, 3079
, 3083, 3089, 3109, 3119, 3121, 3137, 3163, 3167, 3169, 3181
, 3187, 3191, 3203, 3209, 3217, 3221, 3229, 3251, 3253, 3257
, 3259, 3271, 3299, 3301, 3307, 3313, 3319, 3323, 3329, 3331
, 3343, 3347, 3359, 3361, 3371, 3373, 3389, 3391, 3407, 3413
, 3433, 3449, 3457, 3461, 3463, 3467, 3469, 3491, 3499, 3511
, 3517, 3527, 3529, 3533, 3539, 3541, 3547, 3557, 3559, 3571
, 3581, 3583, 3593, 3607, 3613, 3617, 3623, 3631, 3637, 3643
, 3659, 3671, 3673, 3677, 3691, 3697, 3701, 3709, 3719, 3727
, 3733, 3739, 3761, 3767, 3769, 3779, 3793, 3797, 3803, 3821
, 3823, 3833, 3847, 3851, 3853, 3863, 3877, 3881, 3889, 3907
, 3911, 3917, 3919, 3923, 3929, 3931, 3943, 3947, 3967, 3989
, 4001, 4003, 4007, 4013, 4019, 4021, 4027, 4049, 4051, 4057
, 4073, 4079, 4091, 4093, 4099, 4111, 4127, 4129, 4133, 4139
, 4153, 4157, 4159, 4177, 4201, 4211, 4217, 4219, 4229, 4231
, 4241, 4243, 4253, 4259, 4261, 4271, 4273, 4283, 4289, 4297
, 4327, 4337, 4339, 4349, 4357, 4363, 4373, 4391, 4397, 4409
, 4421, 4423, 4441, 4447, 4451, 4457, 4463, 4481, 4483, 4493
, 4507, 4513, 4517, 4519, 4523, 4547, 4549, 4561, 4567, 4583
, 4591, 4597, 4603, 4621, 4637, 4639, 4643, 4649, 4651, 4657
, 4663, 4673, 4679, 4691, 4703, 4721, 4723, 4729, 4733, 4751
, 4759, 4783, 4787, 4789, 4793, 4799, 4801, 4813, 4817, 4831
, 4861, 4871, 4877, 4889, 4903, 4909, 4919, 4931, 4933, 4937
, 4943, 4951, 4957, 4967, 4969, 4973, 4987, 4993, 4999, 5003
, 5009, 5011, 5021, 5023, 5039, 5051, 5059, 5077, 5081, 5087
, 5099, 5101, 5107, 5113, 5119, 5147, 5153, 5167, 5171, 5179
, 5189, 5197, 5209, 5227, 5231, 5233, 5237, 5261, 5273, 5279
, 5281, 5297, 5303, 5309, 5323, 5333, 5347, 5351, 5381, 5387
, 5393, 5399, 5407, 5413, 5417, 5419, 5431, 5437, 5441, 5443
, 5449, 5471, 5477, 5479, 5483, 5501, 5503, 5507, 5519, 5521
, 5527, 5531, 5557, 5563, 5569, 5573, 5581, 5591, 5623, 5639
, 5641, 5647, 5651, 5653, 5657, 5659, 5669, 5683, 5689, 5693
, 5701, 5711, 5717, 5737, 5741, 5743, 5749, 5779, 5783, 5791
, 5801, 5807, 5813, 5821, 5827, 5839, 5843, 5849, 5851, 5857
, 5861, 5867, 5869, 5879, 5881, 5897, 5903, 5923, 5927, 5939
, 5953, 5981, 5987, 6007, 6011, 6029, 6037, 6043, 6047, 6053
, 6067, 6073, 6079, 6089, 6091, 6101, 6113, 6121, 6131, 6133
, 6143, 6151, 6163, 6173, 6197, 6199, 6203, 6211, 6217, 6221
, 6229, 6247, 6257, 6263, 6269, 6271, 6277, 6287, 6299, 6301
, 6311, 6317, 6323, 6329, 6337, 6343, 6353, 6359, 6361, 6367
, 6373, 6379, 6389, 6397, 6421, 6427, 6449, 6451, 6469, 6473
, 6481, 6491, 6521, 6529, 6547, 6551, 6553, 6563, 6569, 6571
, 6577, 6581, 6599, 6607, 6619, 6637, 6653, 6659, 6661, 6673
, 6679, 6689, 6691, 6701, 6703, 6709, 6719, 6733, 6737, 6761
, 6763, 6779, 6781, 6791, 6793, 6803, 6823, 6827, 6829, 6833
, 6841, 6857, 6863, 6869, 6871, 6883, 6899, 6907, 6911, 6917
, 6947, 6949, 6959, 6961, 6967, 6971, 6977, 6983, 6991, 6997
, 7001, 7013, 7019, 7027, 7039, 7043, 7057, 7069, 7079, 7103
, 7109, 7121, 7127, 7129, 7151, 7159, 7177, 7187, 7193, 7207
, 7211, 7213, 7219, 7229, 7237, 7243, 7247, 7253, 7283, 7297
, 7307, 7309, 7321, 7331, 7333, 7349, 7351, 7369, 7393, 7411
, 7417, 7433, 7451, 7457, 7459, 7477, 7481, 7487, 7489, 7499
, 7507, 7517, 7523, 7529, 7537, 7541, 7547, 7549, 7559, 7561
, 7573, 7577, 7583, 7589, 7591, 7603, 7607, 7621, 7639, 7643
, 7649, 7669, 7673, 7681, 7687, 7691, 7699, 7703, 7717, 7723
, 7727, 7741, 7753, 7757, 7759, 7789, 7793, 7817, 7823, 7829
, 7841, 7853, 7867, 7873, 7877, 7879, 7883, 7901, 7907, 7919 };
```

Author: saintony. Published 2015/07/18 15:49:44. Original link: https://blog.csdn.net/saintony/article/details/46943211

You may have already read this great article: http://betterexplained.com/articles/a-visual-intuitive-guide-to-imaginary-numbers/

Once you get the geometric interpretation of (i, j, k), everything becomes crystal clear.

Author: saintony. Published 2015/03/24 01:11:37. Original link: https://blog.csdn.net/saintony/article/details/44578715

**Step 1: Write your Haskell code**

```
{-# LANGUAGE ForeignFunctionInterface #-}
module GfxHaskellExt where

import Foreign
import Foreign.C.Types

foreign export ccall "my_haskell_call" my_haskell_call :: CInt -> IO ()

my_haskell_call :: CInt -> IO ()
my_haskell_call n = do
    putStrLn "It is a HI from Haskell. Hello C"
    print (n + 250)
```

**Step 2: Write the C initialization stub** (directly from http://mostlycode.wordpress.com/2010/01/03/shared-haskell-so-library-with-ghc-6-10-4-and-cabal/):

```
#define CAT(a,b) XCAT(a,b)
#define XCAT(a,b) a ## b
#define STR(a) XSTR(a)
#define XSTR(a) #a

#include <HsFFI.h>

extern void CAT (__stginit_, MODULE) (void);

static void library_init (void) __attribute__ ((constructor));
static void library_init (void)
{
    /* This seems to be a no-op, but it makes the GHCRTS envvar work. */
    static char *argv[] = { STR (MODULE) ".dll", 0 }, **argv_ = argv;
    static int argc = 1;
    hs_init (&argc, &argv_);
    hs_add_root (CAT (__stginit_, MODULE));
}

static void library_exit (void) __attribute__ ((destructor));
static void library_exit (void)
{
    hs_exit ();
}
```

**Step 3: Write the .cabal file**

```
name:            my-haskell-exts
version:         0.3.0
license:         BSD3
copyright:       (c) Intel Corporation
author:          John Doe
maintainer:      John Doe <John.Doe@heaven.com>
stability:       experimental
synopsis:        Test Dll
description:     Experimental project
category:        Test
build-type:      Simple
cabal-version:   >= 1.6

executable HaskellExts.dll
  build-depends:     base == 4.*
  hs-source-dirs:    src
  ghc-options:       -optl-shared -optc-DMODULE=GfxHaskellExt -no-hs-main
  main-is:           HaskellExts.hs
  c-sources:         src/module_init.c
  include-dirs:      src
  install-includes:  HaskellExts.h
  cc-options:        -DMODULE=GfxHaskellExt -shared
  ld-options:        -shared
```

**Step 4: Build with cabal**

- Put your source files (.hs, .c) in a folder named 'src', and put the .cabal file in the project root (the parent of 'src')

- In cmd.exe, run 'cabal configure' first; if it succeeds, run 'cabal build'

**Step 5: Write the C application from which you want to use the Haskell code:**

```
#include "windows.h"

typedef void (_cdecl* HsCalltype)(int);

int main()
{
    HMODULE dllHandle = LoadLibrary("HaskellExts.dll");
    if (dllHandle != NULL)
    {
        HsCalltype pfunc = (HsCalltype)GetProcAddress(dllHandle, "my_haskell_call");
        if (pfunc != NULL)
        {
            pfunc(100);
        }
        FreeLibrary(dllHandle);
    }
    return 0;
}
```

Please note: declare the Haskell function pointer with "_cdecl". And, TADA!

Author: saintony. Published 2014/06/28 05:51:08. Original link: https://blog.csdn.net/saintony/article/details/35498999

*Learn You a Haskell for Great Good*

*Real World Haskell*

I took some time climbing this steep learning curve, and finally got a bird's-eye view of Haskell from the above books.

"The better your C++ skills, the harder learning Haskell feels." True, but only until you find the correct way to learn it: brush up your mathematical thinking and inspect Haskell through that mathematical microscope. The Haskell compiler is a theorem prover, so your Haskell code is purely maths equations - only Haskell is 100% pure.

Laziness? You evaluate equation A only when you need it. Monad? Operations form a monoid. Or, the best quote on monads ever, from Real World Haskell: "Monads are programmable semicolons." When you program Haskell, you are actually writing equations.

OK, back to the books. LYHFGG is the best book for beginners. The author is a young student who possesses an excellent capability for elaboration. All concepts are explained in detail; it fills every gap in your mind. If you read through it carefully, you get its ideas. LYHFGG focuses heavily on the language itself, yet surprisingly there is no section on monad transformers, nor on real-world applications. So after LYHFGG, you will naturally move on to RWH.

The wording of RWH is not as pleasant as the former, but it contains quite a lot of useful information, and its monad chapters are better than LYHFGG's. Best of all, RWH focuses on real-world applications, which is exactly what is needed. You can't grasp RWH by simply reading it; you have to come back to it frequently while you code, until you don't need it anymore :)

Author: saintony. Published 2014/06/20 14:00:22. Original link: https://blog.csdn.net/saintony/article/details/32708637

```
#include "boost/preprocessor/seq/size.hpp"
#include "boost/preprocessor/seq/elem.hpp"
#include "boost/preprocessor/seq/push_back.hpp"
#include "boost/preprocessor/stringize.hpp"
#include "boost/preprocessor/iteration/local.hpp"
#include "boost/preprocessor/arithmetic/add.hpp"
#include "boost/preprocessor/arithmetic/div.hpp"
#include "boost/preprocessor/arithmetic/mul.hpp"
#include "boost/preprocessor/arithmetic/dec.hpp"
#include "boost/preprocessor/arithmetic/mod.hpp"
#include "boost/preprocessor/debug/assert.hpp"
#include "boost/preprocessor/comparison/equal.hpp"
#define SEQ_DIM 4
#define SEQ0 (a)(b)(c)(d) \
(a1)(b1)(c1)(d1)
BOOST_PP_ASSERT_MSG(BOOST_PP_EQUAL(BOOST_PP_MOD(BOOST_PP_SEQ_SIZE(SEQ0), SEQ_DIM), 0), \
"#error SEQ has to be aligned with 4")
#define BOOST_PP_LOCAL_MACRO(n) printf("%s %s %s %s\n", \
BOOST_PP_STRINGIZE(BOOST_PP_SEQ_ELEM(BOOST_PP_ADD(BOOST_PP_MUL(n, SEQ_DIM), 0), SEQ0)), \
BOOST_PP_STRINGIZE(BOOST_PP_SEQ_ELEM(BOOST_PP_ADD(BOOST_PP_MUL(n, SEQ_DIM), 1), SEQ0)), \
BOOST_PP_STRINGIZE(BOOST_PP_SEQ_ELEM(BOOST_PP_ADD(BOOST_PP_MUL(n, SEQ_DIM), 2), SEQ0)), \
BOOST_PP_STRINGIZE(BOOST_PP_SEQ_ELEM(BOOST_PP_ADD(BOOST_PP_MUL(n, SEQ_DIM), 3), SEQ0)) );
#define BOOST_PP_LOCAL_LIMITS (0, BOOST_PP_DEC(BOOST_PP_DIV(BOOST_PP_SEQ_SIZE(SEQ0), SEQ_DIM)))
#include BOOST_PP_LOCAL_ITERATE()
```

Author: saintony. Published 2014/06/04 12:52:45. Original link: https://blog.csdn.net/saintony/article/details/28412085

1. Add -g -O0 -fprofile-arcs -ftest-coverage to CXXFLAGS in your makefile. (--coverage didn't work for me)

2. Add -lgcov to LFLAGS

3. gmake your project and run your binaries

4. If your intermediate output folder differs from your src folder, copy the *.gcda / *.gcno / *.o (*.o?) files into the src folder, keeping the same names

5. At your binary path, run: lcov --capture --directory ../../EACH_DIR --output-file YOUR_PRJ_NAME.info

6. If needed, merge all the .info files into one: lcov -a 1.info -a 2.info ... -o all.info

7. genhtml all.info --output-directory WEB_PATH

Important note: geninfo in lcov 1.9 and earlier has a hang bug. Please make sure you use 1.10 or later.

Author: saintony. Published 2014/04/24 14:30:14. Original link: https://blog.csdn.net/saintony/article/details/24411747

http://www.muppetlabs.com/~breadbox/bf/

http://www.hevanet.com/cristofd/brainfuck/

And the best online interpreter using JavaScript: http://nayuki.eigenstate.org/page/brainfuck-interpreter-javascript

And this is qsort in this language (reference: http://codegolf.stackexchange.com/questions/2445/implement-quicksort-in-brainf)

```
>>>>>>>>,[>,]<[[>>>+<<<-]>[<+>-]<+<]>[<<<<<<<<+>>>>>>>>-]<<<<<<<<[[>>+
>+>>+<<<<<-]>>[<<+>>-]<[>+>>+>>+<<<<<-]>[<+>-]>>>>[-<->]+<[>->+<<-[>>-
<<[-]]]>[<+>-]>[<<+>>-]<+<[->-<<[-]<[-]<<[-]<[[>+<-]<]>>[>]<+>>>>]>[-<
<+[-[>+<-]<-[>+<-]>>>>>>>>[<<<<<<<<+>>>>>>>>-]<<<<<<]<<[>>+<<-]>[>[>+>
>+<<<-]>[<+>-]>>>>>>[<+<+>>-]<[>+<-]<<<[>+>[<-]<[<]>>[<<+>[-]+>-]>-<<-
]>>[-]+<<<[->>+<<]>>[->-<<<<<[>+<-]<[>+<-]>>>>>>>>[<<<<<<<<+>>>>>>>>-]
<<]>[[-]<<<<<<[>>+>>>>>+<<<<<<<-]>>[<<+>>-]>>>>>[-[>>[<<<+>>>-]<[>+<-]
<-[>+<-]>]<<[[>>+<<-]<]]>]<<<<<<-]>[>>>>>>+<<<<<<-]<<[[>>>>>>>+<<<<<<<
-]>[<+>-]<+<]<[[>>>>>>>>+<<<<<<<<-]>>[<+>-]<+<<]>+>[<-<<[>+<-]<[<]>[[<
+>-]>]>>>[<<<<+>>>>-]<<[<+>-]>>]<[-<<+>>]>>>]<<<<<<]>>>>>>>>>>>[.>]
```
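Writing a minimal interpreter yourself is the easiest way to demystify the language. Here is a sketch of my own in Python (not from any of the links above); it handles the eight commands, ignores everything else, and treats end-of-input as 0, the convention the qsort program above relies on:

```
def brainfuck(code, input_bytes=b""):
    # Precompute matching bracket positions
    stack, jumps = [], {}
    for i, c in enumerate(code):
        if c == '[':
            stack.append(i)
        elif c == ']':
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    tape = [0] * 30000
    ptr = pc = inp = 0
    out = []
    while pc < len(code):
        c = code[pc]
        if c == '>': ptr += 1
        elif c == '<': ptr -= 1
        elif c == '+': tape[ptr] = (tape[ptr] + 1) % 256
        elif c == '-': tape[ptr] = (tape[ptr] - 1) % 256
        elif c == '.': out.append(chr(tape[ptr]))
        elif c == ',':
            tape[ptr] = input_bytes[inp] if inp < len(input_bytes) else 0
            inp += 1
        elif c == '[' and tape[ptr] == 0: pc = jumps[pc]
        elif c == ']' and tape[ptr] != 0: pc = jumps[pc]
        pc += 1
    return ''.join(out)

print(brainfuck("++++++++[>++++++++<-]>+."))  # prints "A" (8*8 + 1 = 65)
```

About 25 lines, and enough to run the qsort program above.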

Probably there are two ways to decode a Brainf*ck program properly:

1. Use a Brainf*ck-to-C converter

2. Ask a seasoned TCS PhD student whose research focus is Turing machines

Author: saintony. Published 2014/02/19 06:28:55. Original link: https://blog.csdn.net/saintony/article/details/19454669

The first half of chapter 1, "Recurrent Problems", teaches a very important interview skill: "start from simple cases" and try to find a pattern in them - this is what I learned from "Programming Interviews Exposed", and it works very well. But this is a maths book, so the generalization method is introduced and applied to the Josephus problem. This is the best text connecting computer science with mathematics. It reminds me of a wonderful part in Sipser's book showing how an algorithm is decoded into a Turing machine's language.

Chapter 2 handles all kinds of SUMs. I never knew there were so many interesting facts about SUM - which I had considered intuitive and not very brain-demanding.

So far, nothing is beyond early-undergrad knowledge. But it is simply fun and amazing.

Author: saintony. Published 2013/11/11 00:51:03. Original link: https://blog.csdn.net/saintony/article/details/15027559

Not sure how much of this book I've digested and retained (my experience with it: 5 months in spare time), but what I have learned is still enormous: the way Sipser conducts his proofs, and knowing where to look in the future whenever a TCS-related topic appears in my work - though Pr <= 2^(-n) :P

Author: saintony. Published 2013/10/26 05:06:08. Original link: https://blog.csdn.net/saintony/article/details/13089633

It is called the C++ 'GRANDMASTER' certification. Sounds pretty ambitious and challenging, doesn't it? Basically you are supposed to build a complete compiler tool chain from scratch. Interestingly, I had the same ambition 2 years ago. I finished the lexical part of the Dragon Book and my own lexical analyzer, but after careful consideration I decided to quit. Why? Walking a path that a lot of others have already walked is not the best choice when you are capable of figuring out new tools and ideas - a much better investment of effort.

And talking about this CPPGM: it has everything except optimization - geez, that's the pearl of a compiler!

I share the same thoughts as this guy: "Why I'm quitting the C++ Grandmaster Certification course"

Author: saintony. Published 2013/08/20 00:46:08. Original link: https://blog.csdn.net/saintony/article/details/10089809

A Turing machine has a magic 'tape' which acts like RAM - an infinite RAM: read/write, moving forwards and backwards. It is much more powerful than the stack used by a PDA, and this is where the capacity for computation comes from. A typical TM uses only one tape, and a multi-tape TM has the same power as a one-tape TM. Why? Just merge those tapes into one tape and modify the transition function. A non-deterministic TM reproduces itself for every possibility and explores each branch until an ACCEPT is encountered. And so far, except for quantum computers, all other computational models can be reduced to a TM.

I never knew there was a level lower than an algorithm - well, a Turing machine is. By encoding the data structures an algorithm uses, a TM can run that code - all right, Shannon's information theory here..

Author: saintony. Published 2013/07/23 01:37:56. Original link: https://blog.csdn.net/saintony/article/details/9416477

Having covered GNFAs and regular languages, readers should be able to understand PDAs and context-free grammars without too much effort. Yes, this is what you saw in your compiler classes. On top of the (G)NFA, a stack is added for "recording extra info", and that stack is what makes the pushdown automaton more powerful than the GNFA.

Compare the GNFA and the PDA: a GNFA is so simple that a lot of information is simply "forgotten" after each move - its visibility is limited to "neighboring" states; but with a stack, we have memories of the past! That means the "linear" style of the GNFA (I know, it can be a graph, but it is still directed) becomes a tree - a grammar tree.

The big star of these two chapters: the pumping lemma. It is used to prove that some language is NOT regular or context-free. Essentially each pumping style (no, not Gangnam style) reflects the gist of its corresponding model: a GNFA is essentially a linear procedure (as above), so its pumping is linear; while the context-free pumping lemma is a recursive (stack-like) kind of pumping. Smart maths tricks are applied to the 3 conditions in these lemmas, and contradiction, from Ch. 0, is used as the proof strategy.

P.S. I haven't liked the terminology in this field from the very beginning. Language? Grammar? They remind me of ETS - the Educational Testing Service, the first big boss you conquer before your first flight ticket to the US.

Anyway, Michael Sipser is an excellent teacher. He knows how students learn the theory of computation. For sure he gave assignments to his students - the part I totally skipped; don't blame me, I have no plan to become a professor.

Author: saintony. Published 2013/07/16 14:50:57. Original link: https://blog.csdn.net/saintony/article/details/9342495

**Ch. 0 Introduction** - covers several very basic mathematical foundations that are the premise for continuing your study;

**Ch. 1 Regular Languages** - Starting with regular expressions, which every CSer knows about, the author elaborates on finite automata - DFA and (G)NFA - in an extremely accessible way, including their properties and the conversions among them.
What's more, everything is guarded by rigorous MATHS proofs, so you have fun concisely, without any loss of theoretical rigor. The JOY of this chapter: the FA is a fun toy - flexible, yet following the precise rules of maths. :P

Seriously, can't wait to read on!

Author: saintony. Published 2013/07/08 14:02:57. Original link: https://blog.csdn.net/saintony/article/details/9271233

I saw this book in a bookstore first, instead of on Amazon or China-Pub, which is how I usually get to know new books.

I was amazed by this book and became happier and happier while reading it. It belongs to the "interest-based" category of Chinese books, which are very rare.

None of the maths and theory in the book is too profound; most of it can be categorized into basic undergrad courses - linear algebra, probability, information theory, number theory, high-school maths, etc. But the author does a great job of combining this boring maths with real-world applications, filled with the huge fun of MATHS. Have you doubted whether the linear algebra you learnt in class is useless? Try the chapters on Google. Want a much smarter way to implement IsInSet()? Try a Bloom filter - a really concise and beautiful solution - and then you see the power and beauty of MATHS.
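The Bloom filter mentioned above really is only a few lines of code. A minimal sketch of mine (the parameters m and k are made up; real deployments size them from the expected item count and the target false-positive rate):

```
import hashlib

class BloomFilter:
    # m bits, k hash functions derived from salted SHA-256
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _hashes(self, item):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for pos in self._hashes(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False positives are possible, false negatives are not
        return all(self.bits >> pos & 1 for pos in self._hashes(item))

bf = BloomFilter()
bf.add("hello")
print(bf.might_contain("hello"))  # True
```

The trade-off is exactly the mathematical one the book celebrates: a fixed, tiny bit array answers membership queries with a tunable false-positive probability instead of storing the items themselves.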

There are numerous sparkling spots in the book, and you will enjoy them all - Markov chains & Bayesian networks, the Viterbi algorithm & dynamic programming, applications of probability, etc. All of these are in the NLP realm, so you may feel enthusiastic to explore MATHS in your own focus area - if your field is not NLP. Even if you work in industry, you will be excited to see how MATHS can do magic in your daily work :)

Author: saintony. Published 2013/07/03 23:48:07. Original link: https://blog.csdn.net/saintony/article/details/9238513

This article is a brief review of existing open-source register allocation algorithms, in GCC and LLVM.

1. Traditional, general Register Allocation algorithm - Graph Coloring

In the input program there is an implicit interference graph among variables and instructions: variables that are live at the same time must reside in different registers to avoid conflicts. Allocating registers is then essentially graph coloring, an NP-hard problem. This is not the focus of this review; please refer to http://www.cs.cmu.edu/~dkoes/research/graphraTR.pdf
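To make the idea concrete, here is a toy greedy-coloring sketch in Python (an illustration only - the interference graph and register count below are invented, and production allocators add spilling heuristics, coalescing, and cost models on top of this):

```
def color(interference, k):
    """interference: {var: set of conflicting vars}; k available registers.
    Returns {var: register index}, or raises when a variable must spill."""
    assignment = {}
    # Color in order of decreasing degree (a common heuristic)
    for var in sorted(interference, key=lambda v: -len(interference[v])):
        used = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(k) if r not in used]
        if not free:
            raise RuntimeError(f"{var} must be spilled to memory")
        assignment[var] = free[0]
    return assignment

# a conflicts with b and c; b and c do not conflict, so 2 registers suffice
g = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
print(color(g, 2))  # {'a': 0, 'b': 1, 'c': 1}
```

Note that b and c share a register precisely because they do not interfere - reusing registers across non-overlapping live ranges is the whole point of register allocation.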

2. GCC vs. LLVM

GCC buys into the graph coloring algorithm, but it seems the job is not easy: http://gcc.gnu.org/wiki/RegisterAllocation. LLVM instead adopts simpler but more efficient methods for the RA problem - linear scan plus a spiller strategy - and reports that this works very well, with 1-2% smaller memory occupation and up to 10% faster code execution. In LLVM the Greedy allocator is the default; it builds on the linear-scan strategy, with register live intervals as the basic objects it operates on.

From an engineering perspective, LLVM is indeed the better choice - simpler and more capable. http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html

Author: saintony. Published 2013/05/10 04:35:08. Original link: https://blog.csdn.net/saintony/article/details/8908806

Author: saintony. Published 2012/01/18 15:39:32. Original link: https://blog.csdn.net/saintony/article/details/7209351

Author: saintony. Published 2011/08/28 03:11:03. Original link: https://blog.csdn.net/saintony/article/details/6725834