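Team Introduction
We are a team from the University of Electronic Science and Technology of China (UESTC), competing in the Chengdu-Chongqing (成渝) region under the team name 传说中的傲气仔. Our preliminary-round score was 0.1561, third in the Chengdu-Chongqing region. Our semi-final practice-round score was 2.8456, also third in the region. In the official semi-final we submitted five wrong answers (WA) in a row, and that ended our competition. I had hoped this contest would let me skip autumn recruiting, but since we did not reach the finals, I am back to reviewing and grinding practice problems for the job hunt.
Our approach is much the same as everyone else's. I was busy reviewing and had not planned to open-source it because it seemed like too much trouble, but I could not fall asleep in the dorm at noon, so I took the time to write it up.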
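Preliminary Round
Problem Description
Problem link: problem page
In short, the task is to find all cycles of length 3, 4, 5, 6, and 7 in a directed graph and write them to a specified file in lexicographic order. Anyone who took part in the contest already knows the problem, so I will not go into more detail.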
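Approach
To produce output in lexicographic order, we sort the nodes in ascending order and search for the cycles of each node in turn. When searching from a start node, we only visit nodes larger than that start node, and we use DFS to find the cycles. Because the preliminary data set is small, I/O takes up a large share of the running time, so I/O is also worth optimizing. Unfortunately, I did not know how to optimize I/O well.
Our solution is an optimized version of the baseline open-sourced by an expert on Zhihu. If you are interested, have a look at that baseline (its author reached the finals, respect), and many thanks to them for open-sourcing it. If you have not read the baseline, I recommend it: I still remember that my first submission scored 3.4645, while the Zhihu baseline scored about 2.7 online.
Because the longest cycle has length 7, the simplest method is a seven-level DFS, but that is certainly the slowest. The baseline instead builds a reverse table P2: for a start node begin, it records every node mid that can reach begin in two steps, i.e. mid->next->begin, together with the intermediate node next. The forward search then only needs to go five levels deep and check whether the fifth-level node is one of the recorded mid nodes. This is the 5+2 scheme. On top of P2 we built P3, i.e. end->mid->next->begin, recording for each end node that can reach begin the corresponding mid and next. This turns the scheme into 4+3.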
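The 4+3 idea can be sketched in a few lines. This is a minimal illustration for cycles of length 3 to 7, not the contest code: the names `Graph`, `P3Table`, `buildP3`, `extend`, and `countCycles` are hypothetical, nodes are assumed to be numbered 0..n-1 with no duplicate edges, and it uses `std::vector` where the real solution uses flat arrays.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<uint32_t>>;
using P3Table = std::vector<std::vector<std::pair<uint32_t, uint32_t>>>;

// For a fixed `begin`, record every reverse path end -> mid -> next -> begin
// whose mid and next are larger than `begin`; p3[end] holds the (mid, next) pairs.
static P3Table buildP3(const Graph &in, uint32_t begin) {
    P3Table p3(in.size());
    for (uint32_t next : in[begin]) {
        if (next <= begin) continue;
        for (uint32_t mid : in[next]) {
            if (mid <= begin || mid == next) continue;
            for (uint32_t end : in[mid]) {
                if (end < begin || end == mid || end == next) continue;
                p3[end].push_back({mid, next});
            }
        }
    }
    return p3;
}

// Forward search of at most 4 steps from `begin`; at every node reached,
// the cycle is closed through the P3 table (the "4 + 3" split).
static void extend(const Graph &out, const P3Table &p3, uint32_t begin,
                   std::vector<uint32_t> &path, std::vector<bool> &onPath,
                   uint64_t &count) {
    const uint32_t end = path.back();
    for (const auto &tail : p3[end]) {
        // Cycle length is path.size() + 2: the forward path plus mid and next.
        if (!onPath[tail.first] && !onPath[tail.second]) ++count;
    }
    if (path.size() == 5) return;  // begin plus 4 forward steps
    for (uint32_t v : out[end]) {
        if (v <= begin || onPath[v]) continue;
        onPath[v] = true;
        path.push_back(v);
        extend(out, p3, begin, path, onPath, count);
        path.pop_back();
        onPath[v] = false;
    }
}

// Counts simple cycles of length 3..7 in a graph with nodes 0..n-1.
uint64_t countCycles(uint32_t n,
                     const std::vector<std::pair<uint32_t, uint32_t>> &edges) {
    Graph out(n), in(n);
    for (const auto &e : edges) {
        assert(e.first < n && e.second < n);  // precondition of this sketch
        out[e.first].push_back(e.second);
        in[e.second].push_back(e.first);
    }
    uint64_t count = 0;
    std::vector<bool> onPath(n, false);
    for (uint32_t begin = 0; begin < n; ++begin) {
        const P3Table p3 = buildP3(in, begin);
        std::vector<uint32_t> path{begin};
        onPath[begin] = true;
        extend(out, p3, begin, path, onPath, count);
        onPath[begin] = false;
    }
    return count;
}
```

Each cycle is counted once, from its smallest node: the forward path supplies the first part of the cycle and the P3 entry supplies the last two nodes, so the forward search never needs more than four steps.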
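Most teams probably ended up with 4+3, so the difference lies in the implementation details. Everyone went from vector and map at the start to plain arrays in the end, and the basic facts were widely known, e.g. that in the online data the maximum out-degree is at most 50 and the maximum in-degree is at most 255, and that topological sorting is a common pruning strategy, so I will not repeat them here.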
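Specific Optimizations
- After reading the data we need to sort it, and sort is expensive. After ddd published that in the online data only nodes within 50,000 are part of any cycle, we replaced this sort with a counting sort;
- Compared with the baseline, we use the forward adjacency list when building P2 and P3, which is faster;
- After building the backward adjacency list, we build the forward adjacency list from it;
- We no longer build P2 and P3 for all nodes before the traversal, because that requires a map, and map is slow. Instead, we build P2 and P3 for a node when the traversal reaches it, so they can be stored in arrays;
- After building P2, we build P3 on top of it rather than adding another loop level to build P3 directly. P2 is sorted, so a P3 built from P2 is guaranteed to be sorted as well; building P3 with an extra loop would require sorting P3, which is expensive;
- For multithreading, each thread gets its own buffer. Whenever a thread finds a result, it writes it into its own buffer with memcpy; once all cycle searching is finished, the contents of each thread's buffer are written to the file in order. In this way both the cycle search and the output writing are multithreaded;
- We optimized memcpy a little. I did not understand it myself, so I simply searched Baidu for how to optimize memcpy and copied someone's code from the web. It was a pleasant surprise: the online score went from 0.1939 to 0.1749, an improvement of nearly 0.02, for almost no work;
- Our multithreading was also tuned by hand, assigning a number of nodes to each thread. During tuning my teammate noticed that the earlier nodes had far more cycles, but we did not realize that nodes beyond 50,000 had no cycles at all.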
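The first bullet replaces a comparison sort with a counting-style pass. A minimal sketch of that idea follows, assuming every ID is bounded by a known `maxId`; the function name `sortedUniqueIds` and the use of `std::vector` are illustrative choices, not taken from the contest code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the distinct IDs in ascending order using a presence table
// instead of a comparison sort. Every ID must be <= maxId.
std::vector<uint32_t> sortedUniqueIds(const std::vector<uint32_t> &ids,
                                      uint32_t maxId) {
    std::vector<bool> present(static_cast<std::size_t>(maxId) + 1, false);
    for (uint32_t id : ids) {
        assert(id <= maxId);  // precondition of this sketch
        present[id] = true;
    }
    std::vector<uint32_t> result;
    // A 64-bit loop variable avoids wrap-around when maxId is UINT32_MAX.
    for (uint64_t id = 0; id <= maxId; ++id) {
        if (present[id]) result.push_back(static_cast<uint32_t>(id));
    }
    return result;
}
```

The cost is linear in the number of IDs plus `maxId`, which is why a small upper bound on the IDs that matter makes this attractive.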
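Code
Code link: the final version of our code
Room for Improvement
- The recursion over the backward adjacency list could probably be pruned further; an expert pointed this out to us while we were working on the semi-final;
- The recursion could be replaced with nested for loops to avoid copying function arguments;
- My multithreading was implemented by copying the function many times, so every change had to be copy-pasted into every copy, which was very tedious (I was still doing this early in the semi-final). The code once exceeded 2,000 lines. I am embarrassed by it, so please bear with it.
It was very rewarding to watch our score improve little by little. Above all I want to thank IOT_TEAM, who kept sharing ideas with me and got me this far. One more time:
IOT_TEAM rocks.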
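Semi-final
The semi-final adds one more condition: each cycle must also satisfy a constraint on the transfer amounts (for consecutive transfers in the cycle, the ratio of the later amount to the earlier amount must be no less than 0.2 and no more than 3). This mostly amounts to adding one more check.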
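The ratio condition can be tested with integer arithmetic only, which is how the comparisons in the semi-final code below are written. The helper name `ratioOk` is hypothetical, and the sketch assumes a non-zero earlier amount that is small enough for the multiplications not to overflow.

```cpp
#include <cassert>
#include <cstdint>

// True when 0.2 <= later / earlier <= 3, evaluated without floating point:
//   later / earlier >= 0.2  <=>  5 * later >= earlier
//   later / earlier <= 3    <=>  later <= 3 * earlier
bool ratioOk(uint64_t earlier, uint64_t later) {
    // Precondition of this sketch: the products below must not overflow.
    assert(earlier <= UINT64_MAX / 3 && later <= UINT64_MAX / 5);
    return 5 * later >= earlier && later <= 3 * earlier;
}
```

For example, `ratioOk(100, 20)` and `ratioOk(100, 300)` hold (ratios of exactly 0.2 and 3), while `ratioOk(100, 19)` and `ratioOk(100, 301)` do not.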
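The semi-final was still mostly a series of code optimizations; since the problem changed little, everyone was focused on cutting running time. In the preliminary round my multithreading was implemented by duplicating functions, so for the semi-final I switched to arrays indexed by thread. The code shrank to about 500 lines, and it also ran faster.
In addition, our threads previously took a lock, fetched one ID, and released the lock each time they finished searching from a node. Fetching only one ID at a time meant locking and unlocking far too often, which hurt running time. In the semi-final, each thread therefore claims 100 consecutive IDs at a time, which reduces the number of lock operations.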
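Claiming work in blocks can be sketched as follows. Note that this sketch swaps the mutex used in the semi-final code for a `std::atomic` counter, which is a different mechanism with the same effect; the function name `claimBlock` and its parameters are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Claims the next half-open range [first, last) of at most `block` IDs out of
// [0, total). Returns false once all IDs have been handed out.
bool claimBlock(std::atomic<uint32_t> &nextId, uint32_t total, uint32_t block,
                uint32_t &first, uint32_t &last) {
    assert(block > 0);
    first = nextId.fetch_add(block);  // one atomic operation per block of IDs
    if (first >= total) return false;
    last = (total - first < block) ? total : first + block;
    return true;
}
```

Each worker thread would call `claimBlock` in a loop and process the IDs in `[first, last)` until it returns false. With a block size of 100, the shared counter is touched once per 100 nodes instead of once per node.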
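We also replaced the DFS with seven nested for loops.
We added various other pruning steps as well. I will not list them all here, since it has been a while and I have forgotten the details; if you are interested, see the code.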
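Changes in the Official Semi-final
The official semi-final had two main changes:
- Transfer amounts may now have two decimal places. This is easy to handle: multiply the integer part by 100 and add the fractional part.
- The maximum cycle length increases to 8, i.e. we must find cycles of length 3 to 8. This is also easy to handle: just add one more level of for loop.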
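Both changes were easy to handle, so why did I get five WAs?
Because I had written a bug in the data-reading code, related to the condition added in the semi-final. When a transfer amount is 0, the edge can never satisfy that condition, so it cannot be part of any cycle and does not need to be stored. While reading the data I would therefore continue whenever the amount was 0. In the official round this is exactly where things went wrong: some amounts are fractions below 1, such as 0.23, which my conversion turns into 23. Since 23 is not 0, the record should be kept. The bug was that I tested only the integer part rather than the final value 23; the integer part was 0, so the code hit continue when it should not have.
Because of this bug I kept assuming that my handling of the second change was wrong and kept modifying that part, until all five attempts were used up.
All I can say is that it is a pity.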
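The parsing pitfall described above is easy to avoid if the zero test is applied to the fully converted value. The following sketch converts an amount to integer hundredths; the function name `parseAmountHundredths` is hypothetical, and padding a single fractional digit (so that "12.5" becomes 1250) is an assumption about the input format that the semi-final code below does not make.

```cpp
#include <cassert>
#include <cstdint>

// Parses "123", "123.4" or "123.45" at p into hundredths and advances p past
// the number. Fractional digits beyond the second are skipped.
uint64_t parseAmountHundredths(const char *&p) {
    assert(p != nullptr);
    uint64_t value = 0;
    while (*p >= '0' && *p <= '9') {
        value = value * 10 + static_cast<uint64_t>(*p - '0');
        ++p;
    }
    int fractionDigits = 0;
    if (*p == '.') {
        ++p;
        while (*p >= '0' && *p <= '9') {
            if (fractionDigits < 2) {
                value = value * 10 + static_cast<uint64_t>(*p - '0');
                ++fractionDigits;
            }
            ++p;
        }
    }
    // Scale to hundredths when fewer than two fractional digits were present.
    for (; fractionDigits < 2; ++fractionDigits) value *= 10;
    return value;
}
```

A record is then dropped only when `parseAmountHundredths(p) == 0`, so an amount such as 0.23 (which parses to 23) is kept.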
Semi-final Code
The preliminary-round code is too ugly, so I will only post the semi-final code.
#include <bits/stdc++.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <mutex>
#include <sys/time.h>
using namespace std;
string test_data = "/data/test_data.txt";
string out_file = "/projects/student/result.txt";
const uint32_t THREAD_NUM = 4;
const uint32_t BLOCK = 100;
const uint32_t MAX_EDGE = 2000000;
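const uint32_t MAX_NODE_NUM = 2000000; // maximum number of nodes, i.e. the node count after ID remapping
const uint32_t MAX_INDEGREE = 200; // second dimension of P3
const uint32_t MAX_P3_NUM = 15000; // first dimension of P3
struct record{ // temporary record for one line of the input file: data1, data2, amount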
uint32_t data1;
uint32_t data2;
uint64_t amount;
}Data[MAX_EDGE];
struct nextNode{ // adjacency-list entry
uint32_t next;
uint64_t amount;
};
struct boundary{ // adjacency-list bounds (per-thread left bound, shared right bound)
uint32_t l[THREAD_NUM];
uint32_t r;
};
struct P3Node{ // P3 entry: end->mid->next->begin
uint32_t mid;
uint32_t next;
uint64_t amountBegin; // amount of next->begin
uint64_t amountEnd; // amount of end->mid
}P3[THREAD_NUM][MAX_P3_NUM][MAX_INDEGREE];
uint32_t reachable3[THREAD_NUM][3][MAX_NODE_NUM];
uint32_t P3Num[THREAD_NUM][MAX_P3_NUM];
struct blockInfo {
char *address[6];
uint32_t len[6];
} blocktoThread[MAX_NODE_NUM];
char idsComma[MAX_EDGE][20];
int ids[MAX_EDGE];
bitset<MAX_NODE_NUM> visited[THREAD_NUM];
uint32_t loop[(THREAD_NUM / 4 + 1) * 4];
boundary backwardGraphBound[MAX_EDGE];
boundary forwardGraphBound[MAX_EDGE];
nextNode backwardGraph[MAX_EDGE];
nextNode forwardGraph[MAX_EDGE];
uint32_t inDegree[MAX_EDGE];
uint32_t outDegree[MAX_EDGE];
uint32_t index2id[2*MAX_EDGE];
uint32_t nodeNum, pathNum, edgeNum;
uint32_t blockCount, curNode;
// uint32_t c[6][THREAD_NUM];
mutex mtx;
const unsigned int sizeTable[11] = { 9,99,999,9999,99999,999999,9999999,99999999,999999999, 4294967295};
const char digitOnes[] = {
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9',
'0','1','2','3','4','5','6','7','8','9'
};
const char digitTens[] = {
'0','0','0','0','0','0','0','0','0','0',
'1','1','1','1','1','1','1','1','1','1',
'2','2','2','2','2','2','2','2','2','2',
'3','3','3','3','3','3','3','3','3','3',
'4','4','4','4','4','4','4','4','4','4',
'5','5','5','5','5','5','5','5','5','5',
'6','6','6','6','6','6','6','6','6','6',
'7','7','7','7','7','7','7','7','7','7',
'8','8','8','8','8','8','8','8','8','8',
'9','9','9','9','9','9','9','9','9','9'
};
struct Path {
char l3[33 * 5000000];
char l4[44 * 5000000];
char l5[55 * 5000000];
char l6[66 * 5000000];
char l7[77 * 5000000];
char l8[88 * 5000000];
} Result[THREAD_NUM];
/*------------------------------------------------------------------*/
int uint2str(uint32_t num, char *buf)
{
uint32_t i = 0;
for (;; ++i) {
if (num <= sizeTable[i]) {
++i;
break;
}
}
uint32_t len = i;
uint32_t q, r;
while(num >= 65536) {
q = num / 100;
r = num - ((q << 6) + (q << 5) + (q << 2));
num = q;
buf[--i] = digitOnes[r];
buf[--i] = digitTens[r];
}
for (;;) {
q = (num * 52429) >> (16 + 3);
r = num - ((q << 3) + (q << 1));
buf[--i] = digitOnes[r];
num = q;
if (num == 0) break;
}
buf[len] = ',';
return len+1;
}
void addEdge(uint32_t u, uint32_t v, uint64_t m)
{
if (!v || u==v)
return;
backwardGraph[backwardGraphBound[u].r].next = v;
backwardGraph[backwardGraphBound[u].r].amount = m;
++backwardGraphBound[u].r;
}
bool cmp(nextNode a, nextNode b)
{
return a.next < b.next;
}
bool cmp2(P3Node a, P3Node b)
{
if(a.mid == b.mid) return a.next < b.next;
else return a.mid < b.mid;
}
void mmap_read()
{
int fd = open(test_data.c_str(), O_RDONLY);
int len = lseek(fd, 0, SEEK_END);
char *buf = (char *)mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
// read data
uint32_t nodeIdx = 0;
while (*buf > '/') {
uint32_t data1 = 0, data2 = 0, decimal = 0;
uint64_t amount = 0;
while (*buf != ',') {
data1 = (data1<<3) + (data1<<1) + *buf - '0';
++buf;
}
++buf;
while (*buf != ',') {
data2 = (data2<<3) + (data2<<1) + *buf - '0';
++buf;
}
++buf;
// read the integer part
while (*buf != '.' && *buf > '/') {
amount = (amount<<3) + (amount<<1) + *buf - '0';
++buf;
}
if(*buf == '.'){
++buf;
}
// read the fractional part
while (*buf > '/') {
decimal = (decimal<<3) + (decimal<<1) + *buf - '0';
++buf;
}
if (*buf == '\r')
++buf;
++buf;
if (amount == 0 && decimal == 0) continue; // this check is what sank us on the semi-final B leaderboard
Data[nodeIdx].data1 = data1;
Data[nodeIdx].data2 = data2;
Data[nodeIdx].amount = amount * 100 + decimal;
index2id[nodeIdx++] = data1;
++edgeNum;
}
// cout << Data[5039].amount << endl;
// sort + build id2index
unordered_map<uint32_t, uint32_t> id2index;
sort(index2id, index2id + nodeIdx); // sort index2id, then deduplicate
nodeNum = unique(index2id, index2id + nodeIdx) - index2id;
id2index.reserve(nodeNum);
for (uint32_t idx = 0; idx < nodeNum; idx++) {
id2index[index2id[idx]] = idx + 1; // map to consecutive values starting from 1
ids[idx + 1] = uint2str(index2id[idx], idsComma[idx + 1]);
}
for (uint32_t idx = 0; idx < nodeIdx; idx++) {
Data[idx].data1 = id2index[Data[idx].data1];
Data[idx].data2 = id2index[Data[idx].data2];
++inDegree[Data[idx].data2];
++outDegree[Data[idx].data1];
}
backwardGraphBound[1].l[0] = 0;
forwardGraphBound[1].l[0] = 0;
backwardGraphBound[1].r = 0;
forwardGraphBound[1].r = 0;
for (uint32_t idx = 2; idx <= nodeNum; idx++) {
backwardGraphBound[idx].l[0] = backwardGraphBound[idx - 1].l[0] + outDegree[idx - 1];
backwardGraphBound[idx].r = backwardGraphBound[idx].l[0];
forwardGraphBound[idx].l[0] = forwardGraphBound[idx - 1].l[0] + inDegree[idx - 1];
forwardGraphBound[idx].r = forwardGraphBound[idx].l[0];
}
// build backwardGraph
for (uint32_t idx = 0; idx < nodeIdx; idx++) {
addEdge(Data[idx].data1, Data[idx].data2, Data[idx].amount);
}
blockCount = max<uint32_t>(1, nodeNum / BLOCK); // at least one block, so inputs with fewer than BLOCK nodes are still searched
}
void processing() {
uint32_t idx;
for (idx = 1; idx <= nodeNum; ++idx) {
sort(&backwardGraph[backwardGraphBound[idx].l[0]], &backwardGraph[backwardGraphBound[idx].r], cmp);
for (uint32_t idx2 = backwardGraphBound[idx].l[0]; idx2 < backwardGraphBound[idx].r; ++idx2) {
uint32_t &node = backwardGraph[idx2].next;
forwardGraph[forwardGraphBound[node].r].next = idx;
forwardGraph[forwardGraphBound[node].r].amount = backwardGraph[idx2].amount;
++forwardGraphBound[node].r;
}
}
for(uint32_t node =1;node<=nodeNum;node++){
for (uint32_t i = 1; i < THREAD_NUM; ++i) {
backwardGraphBound[node].l[i]=backwardGraphBound[node].l[0];
forwardGraphBound[node].l[i]=forwardGraphBound[node].l[0];
}
}
for(uint32_t i = 0; i < THREAD_NUM; i++) {
for(uint32_t j = 0; j < MAX_NODE_NUM; j++) {
reachable3[i][0][j] = -1;
}
}
}
uint32_t buildP3(uint32_t p0, uint32_t thread_id)
{
uint32_t it1, it2, it3, it4, p1, p2, p3, p4, add3, addressP3 = 0;
uint64_t val1, val2, val3, val4;
for(it1 = forwardGraphBound[p0].l[thread_id]; it1 < forwardGraphBound[p0].r; ++it1) {
p1 = forwardGraph[it1].next;
if(p1 < p0) {
forwardGraphBound[p0].l[thread_id]++;
continue;
}
val1 = forwardGraph[it1].amount;
for(it2 = forwardGraphBound[p1].l[thread_id]; it2 < forwardGraphBound[p1].r; ++it2) {
p2 = forwardGraph[it2].next;
if(p2 < p0) {
forwardGraphBound[p1].l[thread_id]++;
continue;
}
val2 = forwardGraph[it2].amount;
if(p2 == p0 || (uint64_t)5*val1 < val2 || val1 > (uint64_t)3*val2) continue;
for(it3 = forwardGraphBound[p2].l[thread_id]; it3 < forwardGraphBound[p2].r; ++it3) {
p3 = forwardGraph[it3].next;
if(p3 < p0) {
forwardGraphBound[p2].l[thread_id]++;
continue;
}
val3 = forwardGraph[it3].amount;
if(p3 == p1 || (uint64_t)5*val2 < val3 || val2 > (uint64_t)3*val3) continue;
if(reachable3[thread_id][0][p3] != p0) {
reachable3[thread_id][0][p3] = p0;
reachable3[thread_id][1][p3] = addressP3;
P3Num[thread_id][addressP3++] = 0;
}
add3 = reachable3[thread_id][1][p3];
P3[thread_id][add3][P3Num[thread_id][add3]].mid = p2;
P3[thread_id][add3][P3Num[thread_id][add3]].next = p1;
P3[thread_id][add3][P3Num[thread_id][add3]].amountBegin = val1;
P3[thread_id][add3][P3Num[thread_id][add3]++].amountEnd = val3;
for(it4 = forwardGraphBound[p3].l[thread_id]; it4 < forwardGraphBound[p3].r; ++it4) {
p4 = forwardGraph[it4].next;
// if(p4 < p0) {
// forwardGraphBound[p3].l[thread_id]++;
// continue;
// }
val4 = forwardGraph[it4].amount;
if(p4 == p1 || p4 == p2 || (uint64_t)5*val3 < val4 || val3 > (uint64_t)3*val4) continue;
if(reachable3[thread_id][2][p4] != p0) reachable3[thread_id][2][p4] = p0;
}
}
}
}
for(uint32_t m = 0; m < addressP3; m++) {
sort(P3[thread_id][m], P3[thread_id][m]+P3Num[thread_id][m], cmp2);
}
return addressP3;
}
void getAllCycle(uint32_t thread_id)
{
char *resultPath[6]{};
char *resultPathStart[6]{Result[thread_id].l3, Result[thread_id].l4, Result[thread_id].l5,
Result[thread_id].l6, Result[thread_id].l7, Result[thread_id].l8};
uint32_t p1;
uint32_t circntArray = 0;
uint32_t it1, it2, it3, it4, it5, it6, it7, it8;
uint64_t m123, m125, m233, m343, m453, m563;
char *loc;
while(true) {
mtx.lock();
uint32_t blockId = curNode++;
mtx.unlock();
if (blockId >= blockCount) {
loop[thread_id] = circntArray;
return;
}
uint32_t i, start = blockId * BLOCK + 1, end;
if (blockId == blockCount - 1)
end = nodeNum + 1;
else
end = (blockId + 1) * BLOCK + 1;
for (i = 0; i < 6; i++)
resultPath[i] = resultPathStart[i];
// level 1
for(p1 = start; p1 < end; ++p1) {
if(!buildP3(p1, thread_id)) continue;
// cycles of length 3
/*---- write to the result buffer ----*/
if(reachable3[thread_id][0][p1] == p1) {
uint32_t &index = reachable3[thread_id][1][p1];
for(uint32_t n = 0; n < P3Num[thread_id][index]; n++) {
P3Node &temp = P3[thread_id][index][n];
if((uint64_t)5*temp.amountEnd < temp.amountBegin || temp.amountEnd > (uint64_t)3*temp.amountBegin)
continue;
memcpy(resultPath[0], idsComma[p1], ids[p1]);
resultPath[0] += ids[p1];
memcpy(resultPath[0], idsComma[temp.mid], ids[temp.mid]);
resultPath[0] += ids[temp.mid];
memcpy(resultPath[0], idsComma[temp.next], ids[temp.next]);
resultPath[0] += ids[temp.next];
*(resultPath[0] - 1) = '\n';
++circntArray;
// c[0][thread_id]++;
}
}
// level 2
visited[thread_id][p1] = 1;
for(it2 = backwardGraphBound[p1].l[thread_id]; it2 < backwardGraphBound[p1].r; ++it2) {
uint32_t &p2 = backwardGraph[it2].next;
uint64_t &m12 = backwardGraph[it2].amount;
m123 = (uint64_t)m12 + ((uint64_t)m12 << 1);
m125 = (uint64_t)m12 + ((uint64_t)m12 << 2);
if(p2 < p1) {
backwardGraphBound[p1].l[thread_id]++;
continue;
}
// cycles of length 4
/*---- write to the result buffer ----*/
if(reachable3[thread_id][0][p2] == p1) {
uint32_t &index = reachable3[thread_id][1][p2];
for(uint32_t n = 0; n < P3Num[thread_id][index]; n++) {
P3Node &temp = P3[thread_id][index][n];
if(m12 > (uint64_t)5*temp.amountEnd || temp.amountEnd > m123 || temp.amountBegin > m125 || m12 > (uint64_t)3*temp.amountBegin)
continue;
memcpy(resultPath[1], idsComma[p1], ids[p1]);
resultPath[1] += ids[p1];
memcpy(resultPath[1], idsComma[p2], ids[p2]);
resultPath[1] += ids[p2];
memcpy(resultPath[1], idsComma[temp.mid], ids[temp.mid]);
resultPath[1] += ids[temp.mid];
memcpy(resultPath[1], idsComma[temp.next], ids[temp.next]);
resultPath[1] += ids[temp.next];
*(resultPath[1] - 1) = '\n';
++circntArray;
// c[1][thread_id]++;
}
}
// level 3
visited[thread_id][p2] = 1;
for(it3 = backwardGraphBound[p2].l[thread_id]; it3 < backwardGraphBound[p2].r; ++it3) {
uint32_t &p3 = backwardGraph[it3].next;
uint64_t &m23 = backwardGraph[it3].amount;
m233 = (uint64_t)m23 + ((uint64_t)m23 << 1);
if(p3 < p1) {
backwardGraphBound[p2].l[thread_id]++;
continue;
}
if(visited[thread_id][p3] || m12 > (uint64_t)5*m23 || m23 > m123) continue;
// cycles of length 5
/*---- write to the result buffer ----*/
if(reachable3[thread_id][0][p3] == p1) {
uint32_t &index = reachable3[thread_id][1][p3];
for(uint32_t n = 0; n < P3Num[thread_id][index]; n++) {
P3Node &temp = P3[thread_id][index][n];
if(m23 > (uint64_t)5*temp.amountEnd || temp.amountEnd > m233 || temp.amountBegin > m125 || m12 > (uint64_t)3*temp.amountBegin)
continue;
if(visited[thread_id][temp.next] || visited[thread_id][temp.mid])
continue;
memcpy(resultPath[2], idsComma[p1], ids[p1]);
resultPath[2] += ids[p1];
memcpy(resultPath[2], idsComma[p2], ids[p2]);
resultPath[2] += ids[p2];
memcpy(resultPath[2], idsComma[p3], ids[p3]);
resultPath[2] += ids[p3];
memcpy(resultPath[2], idsComma[temp.mid], ids[temp.mid]);
resultPath[2] += ids[temp.mid];
memcpy(resultPath[2], idsComma[temp.next], ids[temp.next]);
resultPath[2] += ids[temp.next];
*(resultPath[2] - 1) = '\n';
++circntArray;
// c[2][thread_id]++;
}
}
// level 4
visited[thread_id][p3] = 1;
for(it4 = backwardGraphBound[p3].l[thread_id]; it4 < backwardGraphBound[p3].r; ++it4) {
uint32_t &p4 = backwardGraph[it4].next;
uint64_t &m34 = backwardGraph[it4].amount;
m343 = (uint64_t)m34 + ((uint64_t)m34 << 1);
if(p4 < p1) {
backwardGraphBound[p3].l[thread_id]++;
continue;
}
if(visited[thread_id][p4] || m23 > (uint64_t)5*m34 || m34 > m233) continue;
// cycles of length 6
/*---- write to the result buffer ----*/
if(reachable3[thread_id][0][p4] == p1) {
uint32_t &index = reachable3[thread_id][1][p4];
for(uint32_t n = 0; n < P3Num[thread_id][index]; n++) {
P3Node &temp = P3[thread_id][index][n];
if(m34 > (uint64_t)5*temp.amountEnd || temp.amountEnd > m343 || temp.amountBegin > m125 || m12 > (uint64_t)3*temp.amountBegin)
continue;
if(visited[thread_id][temp.next] || visited[thread_id][temp.mid])
continue;
memcpy(resultPath[3], idsComma[p1], ids[p1]);
resultPath[3] += ids[p1];
memcpy(resultPath[3], idsComma[p2], ids[p2]);
resultPath[3] += ids[p2];
memcpy(resultPath[3], idsComma[p3], ids[p3]);
resultPath[3] += ids[p3];
memcpy(resultPath[3], idsComma[p4], ids[p4]);
resultPath[3] += ids[p4];
memcpy(resultPath[3], idsComma[temp.mid], ids[temp.mid]);
resultPath[3] += ids[temp.mid];
memcpy(resultPath[3], idsComma[temp.next], ids[temp.next]);
resultPath[3] += ids[temp.next];
*(resultPath[3] - 1) = '\n';
++circntArray;
// c[3][thread_id]++;
}
}
// level 5
visited[thread_id][p4] = 1;
for(it5 = backwardGraphBound[p4].l[thread_id]; it5 < backwardGraphBound[p4].r; ++it5) {
uint32_t &p5 = backwardGraph[it5].next;
uint64_t &m45 = backwardGraph[it5].amount;
m453 = (uint64_t)m45 + ((uint64_t)m45 << 1);
if(p5 < p1) {
backwardGraphBound[p4].l[thread_id]++;
continue;
}
if(visited[thread_id][p5] || m34 > (uint64_t)5*m45 || m45 > m343) continue;
// cycles of length 7: check reachable3 first
/*---- write to the result buffer ----*/
if(reachable3[thread_id][0][p5] == p1) {
uint32_t &index = reachable3[thread_id][1][p5];
for(uint32_t n = 0; n < P3Num[thread_id][index]; n++) {
P3Node &temp = P3[thread_id][index][n];
if(m45 > (uint64_t)5*temp.amountEnd || temp.amountEnd > m453 || temp.amountBegin > m125 || m12 > (uint64_t)3*temp.amountBegin)
continue;
if(visited[thread_id][temp.next] || visited[thread_id][temp.mid])
continue;
memcpy(resultPath[4], idsComma[p1], ids[p1]);
resultPath[4] += ids[p1];
memcpy(resultPath[4], idsComma[p2], ids[p2]);
resultPath[4] += ids[p2];
memcpy(resultPath[4], idsComma[p3], ids[p3]);
resultPath[4] += ids[p3];
memcpy(resultPath[4], idsComma[p4], ids[p4]);
resultPath[4] += ids[p4];
memcpy(resultPath[4], idsComma[p5], ids[p5]);
resultPath[4] += ids[p5];
memcpy(resultPath[4], idsComma[temp.mid], ids[temp.mid]);
resultPath[4] += ids[temp.mid];
memcpy(resultPath[4], idsComma[temp.next], ids[temp.next]);
resultPath[4] += ids[temp.next];
*(resultPath[4] - 1) = '\n';
++circntArray;
// c[4][thread_id]++;
}
}
if(reachable3[thread_id][2][p5] != p1) continue;
// level 6
visited[thread_id][p5] = 1;
for(it6 = backwardGraphBound[p5].l[thread_id]; it6 < backwardGraphBound[p5].r; ++it6) {
uint32_t &p6 = backwardGraph[it6].next;
uint64_t &m56 = backwardGraph[it6].amount;
m563 = (uint64_t)m56 + ((uint64_t)m56 << 1);
if(p6 < p1) {
backwardGraphBound[p5].l[thread_id]++;
continue;
}
if(reachable3[thread_id][0][p6] != p1) continue;
if(visited[thread_id][p6] || m45 > (uint64_t)5*m56 || m56 > (uint64_t)3*m45) continue;
uint32_t &index = reachable3[thread_id][1][p6];
for(uint32_t n = 0; n < P3Num[thread_id][index]; n++) {
P3Node &temp = P3[thread_id][index][n];
if(m56 > (uint64_t)5*temp.amountEnd || temp.amountEnd > m563 || temp.amountBegin > m125 || m12 > (uint64_t)3*temp.amountBegin)
continue;
if(visited[thread_id][temp.next] || visited[thread_id][temp.mid])
continue;
memcpy(resultPath[5], idsComma[p1], ids[p1]);
resultPath[5] += ids[p1];
memcpy(resultPath[5], idsComma[p2], ids[p2]);
resultPath[5] += ids[p2];
memcpy(resultPath[5], idsComma[p3], ids[p3]);
resultPath[5] += ids[p3];
memcpy(resultPath[5], idsComma[p4], ids[p4]);
resultPath[5] += ids[p4];
memcpy(resultPath[5], idsComma[p5], ids[p5]);
resultPath[5] += ids[p5];
memcpy(resultPath[5], idsComma[p6], ids[p6]);
resultPath[5] += ids[p6];
memcpy(resultPath[5], idsComma[temp.mid], ids[temp.mid]);
resultPath[5] += ids[temp.mid];
memcpy(resultPath[5], idsComma[temp.next], ids[temp.next]);
resultPath[5] += ids[temp.next];
*(resultPath[5] - 1) = '\n';
++circntArray;
// c[5][thread_id]++;
}
}
visited[thread_id][p5] = 0;
}
visited[thread_id][p4] = 0;
}
visited[thread_id][p3] = 0;
}
visited[thread_id][p2] = 0;
}
visited[thread_id][p1] = 0;
}
for (i = 0; i < 6; i++) {
blocktoThread[blockId].address[i] = resultPathStart[i];
blocktoThread[blockId].len[i] = resultPath[i] - resultPathStart[i];
resultPathStart[i] += (blocktoThread[blockId].len[i]);
}
}
}
static void f_write() {
for (uint32_t t = 0; t < THREAD_NUM; ++t) {
pathNum += loop[t];
}
char tmp[16];
int len = uint2str(pathNum, tmp);
tmp[len-1] = '\n';
int file_fd = open(out_file.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0666); // truncate so a longer stale result file cannot leave trailing data
write(file_fd, tmp, len);
for (uint32_t k = 0; k < 6; ++k) {
for (uint32_t block = 0; block < blockCount; ++block) {
if (blocktoThread[block].len[k])
write(file_fd, blocktoThread[block].address[k], blocktoThread[block].len[k]);
}
}
}
int main() {
// auto start = std::chrono::high_resolution_clock::now();
// auto startTime = std::chrono::high_resolution_clock::now();
mmap_read();
// auto endTime = std::chrono::high_resolution_clock::now();
// cout << "read time: " << std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime).count() << "ms" << endl;
// startTime = std::chrono::high_resolution_clock::now();
processing();
// endTime = std::chrono::high_resolution_clock::now();
// cout << "processing time: " << std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime).count() << "ms" << endl;
// startTime = std::chrono::high_resolution_clock::now();
std::thread threads[THREAD_NUM - 1];
int t;
for (t = 0; t < THREAD_NUM - 1; t++) {
threads[t] = std::thread(getAllCycle, t);
}
getAllCycle(t);
for (t = 0; t < THREAD_NUM - 1; t++) {
threads[t].join();
}
// endTime = std::chrono::high_resolution_clock::now();
// cout << "getAllCycle time: " << std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime).count() << "ms" << endl;
// startTime = std::chrono::high_resolution_clock::now();
f_write();
// endTime = std::chrono::high_resolution_clock::now();
// cout << "write time: " << std::chrono::duration_cast<std::chrono::milliseconds>(endTime - startTime).count() << "ms" << endl;
// auto end = std::chrono::high_resolution_clock::now();
// cout << "total time: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << endl;
// uint32_t path[6] = {0};
// for(int i = 0; i < 6; i++) {
// for(int j = 0; j < THREAD_NUM; j++) {
// path[i] += c[i][j];
// }
// }
// cout << "3 loops: " << path[0] << endl;
// cout << "4 loops: " << path[1] << endl;
// cout << "5 loops: " << path[2] << endl;
// cout << "6 loops: " << path[3] << endl;
// cout << "7 loops: " << path[4] << endl;
// cout << "8 loops: " << path[5] << endl;
// cout << "result:" << pathNum << endl;
return 0;
}