C/C++ 高性能计算之多线程简单基础入门教程
比起别人的盲目罗列函数接口,鹦鹉学舌式的解释每一个输入参数和输出参数,一味求全而无重点,我的文章更侧重于入门知识的讲解,宁缺毋滥,只有一些最简单的入门用法,先亮代码,让你以最快地速度感受到这个东西原来是这个样子。更深入的知识,可以查看文档以及其他参考。
不管什么东西,都是尽可能地用新不用旧,所以,文章涉及的库和工具,都是基于 C++11 标准的。
前面有相关的博文已经介绍过 MPI,OpenMP 等相关的并行技术,以及 hadoop 生态圈相关的一些知识。今天给大家介绍一下 C/C++(下面统称为 C)中的利用多线程并行或者说并发进行计算的一个简单基础入门教程。
我这篇教程适合的人群为那些没接触过 C 的多线程编程,想要入门的同学,以及那些有程序基础,单纯的想要给自己的程序譬如循环部分,加加速,想要拷贝一份模板进行修改的人。
让我们开始吧。
从一个极简的例子开始
假设你现在写了一段 C 代码,大概长成下面这样。
using namespace std;
void fun(para_ref,para)
{
for (int j = 0; j < task_size; j++) {
dosomething
}
}
int main(int argc, char *argv[])
{
fun(para_ref,para);
return 0;
}
你哇的一下突然发现,你调用的这个函数里面有个 for 循环,这个 for 循环的每一次 j,似乎有没什么关系,各算各的,你也知道自己的电脑有四个核心,总不能浪费资源吧,你开始谋划并行的策略。你老板说不能用 OpenMp 的编译选项,MPI 的编程门槛有很高,然后你想到了似乎在哪里听到过的多线程。
一个最容易想到的就是把 for 循环给均分拆成几个部分,每个线程执行 for 中的一个部分。怎么做呢?非常简单,对于这种无需异步的需求,你像下面这样写就 OK 了。
#include <thread>//使用 c++11 标准的多线程库 thread
#define THREAD_NUM 4//定义多线程的个数
using namespace std;
void fun(para_ref, para, int index_begin, int task_num)
//原来的函数,被拆成了一片片,需要添加循环起始索引,和个数
{
for (int j = index_begin; j < index_begin + task_num; j++)
{
dosomething;
}
}
int main(int argc, char *argv[])
{
thread threads[THREAD_NUM];//定义4个线程
int task_num = (int)(task_size / THREAD_NUM);//计算每个线程负责的任务数
for (int i = 0; i < THREAD_NUM - 1; i++)
{
threads[i] = thread(fun, ref(para_ref), para, i*task_num, task_num);//每个线程开始做各自的任务
}
//因为 task_size 可能不能被平均分成 4 分,最后一份要特别处理
threads[THREAD_NUM - 1] = thread(func, ref(para_ref), para, (THREAD_NUM - 1)*task_num, int(task_num + (task_size% THREAD_NUM)));
for (auto& it : threads)
{
it.join();//4个线程同步
}
return 0;
}
代码中有较为详尽的解释,仔细看一遍,就知道 C++ 多线程是怎么用了。你可以把它当成也给模板,拷贝到自己的代码中,随便改改就能用了。简单解释一下
threads[i] = thread(fun, ref(para_ref), para, i*task_num, task_num);
这里的 thread 表示开始多线程的调用,其中 fun 是你每个线程要调用的回调函数,回调函数要求必须是静态函数,或者说是全局函数,如果你写在 C++ 的某个 Class 里面,可能需要处理的问题就比较多了,因为静态成员函数只能类的静态成员,牵一发而动全身。
fun 后面紧跟的是 fun 的输入参数,这个没什么好说的,需要注意的是,如果 fun 的输入参数有引用的类型的(C++ 中用 & 符号标识),那么多线程调用就要用 ref 关键字说清楚。如上所示。
任务池的方式实现并行的负载均衡(无需等待的 mutex 互斥锁设计)
上面提到的方法是利用循环变量的均匀划分来实现任务的划分,有个问题在于,这样划分真的是负载均衡的吗?有没有可能三个线程秒算完了,一个线程在还在一直算?这样是不是没有拉满 CPU,并不能达到最大性能?
举例来说,循环前 1/4 的占了 99% 的时间,那么这种多线程的任务分配方式就不尽合理。有人提出除 4 取余,余数相同的用同一个线程来做的方法,能做到负载均衡吗?也是看情况而定,假如刚好某个循环变量点的计算占了大部分时间,那么,4 个线程几乎同时算完的概率就很低。
那要怎么搞?一个基本的想法就是维护一个任务池。就像工厂里面的多人同时作业一样。把所有的任务丢到一个筐里面,每个工人都去筐那里认领任务,做完了手头闲了,接着去领任务,这样能够保证没有空闲劳动力,是最好的剥削工人方式。
这个思路听着好像很有道理,但是其实面临着一个非常严峻的的问题——任务争抢和读写脏数据。什么意思呢?我还是用工人的例子来做个假设。
假设有甲乙丙丁四个工人去图书馆拿书,书都是按 1、2、3、4、5……这种顺序编好号并分区放好按号来取的。那么,这时候最好需要在图书馆入口处放一本本子,来登记工人们取的书的编号,以免不知道取没取,就搞乱了。登记的思路是,比如甲乙丙丁各自拿了1、2、3、4四本书并登记好出去了,乙手脚麻利,很快回来,然后翻看一下本子,发现1、2、3、4都被取走了,这时候我取 5 吧,做了登记,就去图书馆领书了……以此类推。这时候,如果甲乙丙干完一个任务,同时回到了图书馆做登记,就一个本子啊,为了多干点活多挣点钱,他们就会开始抢这个本子,一会儿,他们就打起来了,这样就出问题了……怎么办,一个解决的方法,就是让他们排队登记,谁先踏入图书馆的门,就排前面,正在登记的人的本子“锁起来”,不让后面的人碰他,直到他写完,把锁打开。这个锁在 C++ 中有很多类型,比如说互斥锁(mutex),所谓的互斥,就是我在用,排斥你,你在用,就排斥我。学过数据库的就很清楚。
用互斥锁的思路是可以实现的。但是,新问题来了,让工人们排队等,岂不是白白浪费时间,本来一天可以拿 100 本书,现在排队浪费了时间,一天只能拿 90 本书,血亏啊。一般,对于 C++ 程序,如果对性能有很高的要求的话,多线程加锁血崩。那怎么办呢?别急,我有一个好办法。
就是把所有的任务编号全部写到本子上,工人取走一个任务,划掉一个任务。比如说你干完一个任务 5 回到图书馆了,然后看本子上的任务 6 被划掉了,接着看任务 7 也被划掉了,直到任务 8,还没被划掉,那么你就可以认领并划掉任务 8。这里的划掉这个动作,也可以用加锁来实现。这为什么能解决问题呢?譬如说两个工人同时回到了图书馆,然后,甲看到乙正在划任务 10(被上锁了),他不需要等你划完,他可以接着往下找,找到一个没被划掉的,然后取划掉它。这个过程,就不需要等待。不争,才是最好的状态,对于人生来说,又何尝不是如此。
怎么实现呢?同样,我提供一个写法的模板,看代码,永远是最好的学习程序的方式。
同样的,假如你现在手头有这样一份代码
#include<thread>
using namespace std;
void run_dfs()
{
for (int i = 0; i < task_num; ++i)
{
xxxxxxxxxxi
}
}
int main()
{
run_dfs();
exit(0);
}
你希望对 run_dfs 里面的 for 循环进行任务池式的并行。那么,你就可以这样子写:
#include<thread>
#include<vector>
using namespace std;
vector<mutex> mtx(task_num);
void dfs_thread(int thread_id)
{
int i = thread_id;
while (i<task_num)
{
if (mtx[i].try_lock())
{
xxxxxxxxxxi//do_task(i);
}
i++;
}
}
void run_dfs()
{
thread threads[4];
for (int thread_id = 0; thread_id < 4; thread_id++)
{
threads[thread_id] = thread(dfs_thread, thread_id);
}
for (auto& it : threads)
{
it.join();
}
}
int main()
{
run_dfs();
exit(0);
}
是不是很简洁?用了锁,却不需要等待。当然,这里的 while if 也是可以改成 for 的。解释一下,这个模板里面没有什么新的东西。一个是声明了一个互斥锁向量 mtx,用来记录某任务是否被上锁(即是否被工人划掉了)。这里只有用到一个方法就是互斥锁的try_lock()
,上锁成功返回 true,上锁失败返回 false,如果变量地址已经被锁住了,就会上锁失败,这个判断操作是很省时间的。
如果你的任务开始时有序的,做完任务之后还要求是有序的,即原来是按 1、2 、3……拍的任务,工人做好作品之后,需要按原来的顺序排好,那么在程序中就需要想办法记录每个工人认领了哪些任务号,或者说,每个任务依次是被那个任务认领的。
其他
- 多线程只要不写同一个内存地址,是不会引起冲突的,比如说四个线程分别写数组
int a[4]
的四个位置。 - 如果你问我你的程序计算结果相互依赖,依赖关系还很复杂,要怎么设计程序才能最大化 CPU 利用,同步的 wait 和异步的 feature ,了解一下。
- 线程跑完 join 和 detach 怎么理解?顾名思义呗。join 就像一条河流,走到了某个地方,在等待其他河流的加入,如果其他河流不加入(即没运行到这个位置),傲娇的河流就不往前流了。 detach 就是分离,单飞啦,不等你们,就像曹云金离开了郭德纲, detach 了,即使主线程德云社倒闭了,我照样无休止地飞奔下去,一直到东海。
最后,再附几个例子程序吧。
1、
#include <iostream>
#include <vector>
#include <sstream>
#include <fstream>
#include <cmath>
#include <cstdlib>
#include <numeric>
#include <string.h>
#include <thread>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>//用于进程的监听
#include <ctime> //要包含的头文件
#define TEST
#define FEATURE_NUM 1000
#define THREAD_NUM 4
#define TRAIN_DATA_NUM 8000
#define TEST_DATA_NUM 20000
using namespace std;
//一些全局变量的定义
double features[TRAIN_DATA_NUM][FEATURE_NUM];
double features_test[TEST_DATA_NUM][FEATURE_NUM];
const int MAXS = 8 * (FEATURE_NUM + 1)*TRAIN_DATA_NUM;
char buf[MAXS];
const int MAXS_TEST = 8 * FEATURE_NUM*TEST_DATA_NUM;
char buf_test[MAXS_TEST];
int main(int argc, char *argv[])
{
//******************************函数申明**************************************
void loadTrainData2(string trainFile, double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM]);
void loadTrainData(string trainFile, double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM]);
void initParam(double wtInitV, double(*wtSet)[FEATURE_NUM]);
void train(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], double(&wtSet)[FEATURE_NUM],
double stepSize0, double rho, double c, double epsilon);
void loadTestData2(string testFile, double(&features)[TEST_DATA_NUM][FEATURE_NUM]);
void loadTestData(string testFile, double(&features)[TEST_DATA_NUM][FEATURE_NUM]);
void predict(double(&features_test)[TEST_DATA_NUM][FEATURE_NUM], double(&wtSet)[FEATURE_NUM], string predictOutFile);
void loadAnswerData(string awFile, int(&awVec)[TEST_DATA_NUM]);
#ifdef TEST
double ts = clock();
#endif
//***************给定文件路径****************
/* string trainFile = "./data/train_data.txt";
string testFile = "./data/test_data2.txt";
string predictFile = "./projects/student/result.txt";
string answerFile = "./projects/student/answer2.txt"; */
string trainFile = "/data/train_data.txt";
string testFile = "/data/test_data2.txt";
string predictFile = "/projects/student/result.txt";
string answerFile = "/projects/student/answer2.txt";
//********************程序的多进程处理*********************
//创建共享变量和内存块,创建了 ID 为WT_SET的共享内存块,用来放权重
int shm;
double* wtSet_p;
double wtSet[FEATURE_NUM];
shm = shmget(IPC_PRIVATE, FEATURE_NUM*sizeof(double), IPC_CREAT | 0644);
if (shm < 0)
{
perror("shmget");
return 1;
}
//创建子进程
pid_t pid;
pid = fork();
if (pid == -1)
{
perror("fork");
exit(1);
}
else if (pid == 0) //子进程用来读训练数据和训练
{
// cout << "I'm parent, pid = " << getpid() << endl;
//******************给定训练参数***********
const double stepSize0 = 0.5;//初始步长
const double wtInitV = 0;//初始权重
const int maxIterTimes = 9;//最大迭代步数,不一定用到
double rho = 0.5;
double c = 0.01;
double epsilon = 0.01;
//*****************读入训练数据***************
cout << "loading train data ..." << endl;
#ifdef TEST
double ts11 = clock();
#endif
int labels[TRAIN_DATA_NUM];
loadTrainData2(trainFile, features, labels);
// cout << "size of train data:" << (*trainDataSet_p).size() << endl;
// cout << "label of train data:" << (*trainDataSet_p)[10].label << endl;
// cout << "feature of train data:" << (*trainDataSet_p)[10].features[10] << endl;
//getchar();
#ifdef TEST
double te11 = clock();
cout << "loading train data time is " << (double)(te11 - ts11) / CLOCKS_PER_SEC << "s" << endl;
#endif
//*********************开始训练************************
//初始化权重,并开始训练
cout << "training ..." << endl;
#ifdef TEST
double ts12 = clock();
#endif
initParam(wtInitV, &wtSet);//初始化参数
train(features, labels, wtSet, stepSize0, rho, c, epsilon);
//存模型
//storeModel(*wtSet_p);
#ifdef TEST
double te12 = clock();
cout << "train time is " << (double)(te12 - ts12) / CLOCKS_PER_SEC << "s" << endl;
#endif
//取共享内存空间,写入入共享数据权重
wtSet_p = (double*)shmat(shm, NULL, 0);
for (int i = 0; i < FEATURE_NUM; i++)
{
*(wtSet_p + i) = wtSet[i];
/* cout << wtSet[i]<<endl;
getchar(); */
}
shmdt(wtSet_p);//lost connect
#ifdef TEST
double tes = clock();
cout << "son process total time is******************** " << (double)(tes - ts) / CLOCKS_PER_SEC << "s" << endl;
#endif
}
else if (pid > 0) //父进程用来读取测试数据
{
// cout << "I'm child, pid = " << getpid() << endl;
cout << "loading test data ..." << endl;
#ifdef TEST
double ts21 = clock();
#endif
loadTestData2(testFile, features_test);//载入测试数据
#ifdef TEST
double te21 = clock();
cout << "loading test data time is " << (double)(te21 - ts21) / CLOCKS_PER_SEC << "s" << endl;
#endif
//预测,并保存数据
cout << "predicting ..." << endl;
wtSet_p = (double*)shmat(shm, NULL, 0);
/* if(int(wtSet_p) == -1){
perror("shmat");
return -1;
} */
#ifdef TEST
double tswait = clock();
#endif
int status;
wait(&status);
//等待子进程 训练结束
for (int i = 0; i < FEATURE_NUM; i++)
{
wtSet[i] = *(wtSet_p + i);//写入数据
/* cout << wtSet[i]<<endl;
getchar(); */
}
#ifdef TEST
double tewait = clock();
cout << "waiting time is " << (double)(tewait - tswait) / CLOCKS_PER_SEC << "s" << endl;
#endif
#ifdef TEST
double ts22 = clock();
#endif
predict(features_test, wtSet, predictFile);
#ifdef TEST
double te22 = clock();
cout << "predicting time is " << (double)(te22 - ts22) / CLOCKS_PER_SEC << "s" << endl;
#endif
#ifdef TEST
int answerVec[TEST_DATA_NUM];
//载入答案和保存预测值,计算误差率
cout << "calculating accuracy ..." << endl;
// cout << "loading answer data ..." << endl;
loadAnswerData(answerFile, answerVec);//载入标准答案
// cout << "after loading answer data ..." << endl;
int predictVec[TEST_DATA_NUM];
// cout << "loading predict data ..." << endl;
loadAnswerData(predictFile, predictVec);
int answerVec_size = sizeof(answerVec) / sizeof(int);
// std::cout << predictVec.size() << " " << answerVec.size() << endl;
int correctCount = 0;
// cout << "begin calculating accuracy ..." << endl;
for (int j = 0; j < TEST_DATA_NUM; j++)
{
if (j < answerVec_size)
{
if (answerVec[j] == predictVec[j])
{
correctCount++;
}
}
else
{
cout << "answer size less than the real predicted value" << endl;
}
}
double accurate = ((double)correctCount) / TEST_DATA_NUM;
//cout << "(double)correctCount: " << (double)correctCount << endl;
//cout << "(*answerVec_p).size(): " << answerVec.size() << endl;
cout << "the prediction accuracy is ******************** " << accurate << endl;
double tep = clock();
cout << "parent process total time is ******************** " << (double)(tep - ts) / CLOCKS_PER_SEC << "s" << endl;
#endif
}
return 0;
}
inline double Dot(double(&x)[FEATURE_NUM], double(&y)[FEATURE_NUM])
{
double temp = 0;
int i;
for (i = 0; i < FEATURE_NUM; i++)
{
temp = temp + x[i] * y[i];
}
return temp;
}
void initParam(double wtInitV, double(*wtSet)[FEATURE_NUM])
{
int i;
for (i = 0; i < FEATURE_NUM; i++)
{
(*wtSet)[i] = wtInitV;
}
}
void loadTrainData(string trainFile, double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM])
{
#ifdef TEST
double ts111 = clock();
#endif
FILE *fp;
//int tick_end = TRAIN_DATA_NUM*(FEATURE_NUM + 1);
fp = fopen(trainFile.c_str(), "r");
if (fp == NULL)
{
printf("file error\n");
exit(1);
}
fseek(fp, 0L, SEEK_END);
int len = ftell(fp);//自动判断 fread 要读入的量
rewind(fp);//算完要读入的量后重置指针
const int maxS = len;//先读个 60 M
char* buf = new char[maxS];
len = fread(buf, 1, maxS, fp);//因为ftell回车算的是两个字符,所以这里要重置一下len
fclose(fp);
#ifdef TEST
double te111 = clock();
cout << "fread train data time is " << (double)(te111 - ts111) / CLOCKS_PER_SEC << "s" << endl;
double ts112 = clock();
#endif
for (int i = 0; i <= len; i++)
{
//把换行符号替换掉
if (buf[i] == '\n')
{
buf[i] = ',';
}
}
const char * split = ",";
char *p = strtok(buf, split);
double temp;
int row_ind = 0;
int col_ind = 0;
int tick = 0;
//double ftf;
while (p != NULL&&row_ind < TRAIN_DATA_NUM)
{
tick = tick + 1;
sscanf(p, "%lf", &temp);
if (tick % (FEATURE_NUM + 1) == 0)
{
labels[row_ind] = int(temp);
row_ind = row_ind + 1;
col_ind = 0;
p = strtok(NULL, split);
continue;
}
features[row_ind][col_ind] = temp;
col_ind = col_ind + 1;
p = strtok(NULL, split);
}
#ifdef TEST
double te112 = clock();
cout << "split train data time is " << (double)(te112 - ts112) / CLOCKS_PER_SEC << "s" << endl;
#endif
}
//char* buf = new char[MAXS];
void loadTrainData2(string trainFile, double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM])
{
#ifdef TEST
double ts111 = clock();
#endif
FILE *fp;
fp = fopen(trainFile.c_str(), "rb");
int len = fread(buf, 1, MAXS, fp);
fclose(fp);
#ifdef TEST
double te111 = clock();
cout << "fread train data time is " << (double)(te111 - ts111) / CLOCKS_PER_SEC << "s" << endl;
double ts112 = clock();
#endif
buf[len] = '\0';
double position = 1.0;
int symbol = 0;
int point = 0;
int i, col, row = 0;
double line[FEATURE_NUM + 2] = { 0 };
i = 0;
for (char *p = buf; *p && p - buf < len; p++)
{
if (row >= TRAIN_DATA_NUM)
{
break;
}
if (*p == ',')
{
if (symbol == 1)
line[i] = -line[i];
i++;
point = 0;
position = 1.0;
symbol = 0;
line[i] = 0.0;
}
else if (*p == '\n')
{
if (symbol == 1)
line[i] = -line[i];
int ftf = (int)line[FEATURE_NUM];
for (col = 0; col < FEATURE_NUM; col++)
{
features[row][col] = line[col];
}
labels[row] = ftf;
row = row + 1;
i = 0;
point = 0;
position = 1.0;
symbol = 0;
line[i] = 0.0;
}
else if (*p == '.')
{
point = 1;
}
else if (*p == '-')
{
symbol = 1;
}
else
{
if (point == 0)
line[i] = line[i] * 10 + *p - '0';
else
{
position = position*0.1;
line[i] += position*(*p - '0');
}
}
}
#ifdef TEST
double te112 = clock();
cout << "split train data time is " << (double)(te112 - ts112) / CLOCKS_PER_SEC << "s" << endl;
#endif
}
void wxbVecCalc(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], double(&w)[FEATURE_NUM], double wxbVec[TRAIN_DATA_NUM])
{
//just matrix multiplication
double temp;
int i, j;
for (i = 0; i < TRAIN_DATA_NUM; i++)
{
temp = 0;
for (j = 0; j < FEATURE_NUM; j++)
{
temp = temp + features[i][j] * w[j];
}
wxbVec[i] = temp;
}
}
void wxbVecCalc_test(double(&features)[TEST_DATA_NUM][FEATURE_NUM], double(&w)[FEATURE_NUM], double wxbVec[TEST_DATA_NUM])
{
//just matrix multiplication
double temp;
int i, j;
for (i = 0; i < TEST_DATA_NUM; i++)
{
temp = 0;
for (j = 0; j < FEATURE_NUM; j++)
{
temp = temp + features[i][j] * w[j];
}
wxbVec[i] = temp;
}
}
inline double sigmoidCalc(const double wxb)
{
double expv = exp(-1 * wxb);
double expvInv = 1 / (1 + expv);
return expvInv;
}
void sigmoidVecCalc(double(&wxbVec)[TRAIN_DATA_NUM], double(&sigmoidVec)[TRAIN_DATA_NUM])
{
int i;
for (i = 0; i < TRAIN_DATA_NUM; i++)
{
sigmoidVec[i] = sigmoidCalc(wxbVec[i]);
}
}
void sigmoidVecCalc_test(double(&wxbVec)[TEST_DATA_NUM], double(&sigmoidVec)[TEST_DATA_NUM])
{
int i;
for (i = 0; i < TEST_DATA_NUM; i++)
{
sigmoidVec[i] = sigmoidCalc(wxbVec[i]);
}
}
void wrxbmVecCalc(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], double(&labSigmoidVec)[TRAIN_DATA_NUM], double wrxbmVec[TRAIN_DATA_NUM])
{
//just reverse matrix multiplication
double temp=0.0L;
int i, j;
for (i = 0; i < FEATURE_NUM; i++)
{
temp = 0.0;
for (j = 0; j < TRAIN_DATA_NUM; j++)
{
temp = temp + features[j][i] * labSigmoidVec[j];
}
wrxbmVec[i] = temp/TRAIN_DATA_NUM;
}
}
void gradientVec(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], double(&sigmoidVec)[TRAIN_DATA_NUM], double(&gVec)[FEATURE_NUM])
{
double labSigmoidVec[TRAIN_DATA_NUM];
for (int j = 0; j < TRAIN_DATA_NUM; j++)
{
labSigmoidVec[j] = double(labels[j])-sigmoidVec[j];
}
wrxbmVecCalc(features, labSigmoidVec, gVec);
}
double gradientSlope(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], int index, double(&sigmoidVec)[TRAIN_DATA_NUM])
{
//can still be modified
double gsV = 0.0L;
int i;
double sigv, label;
for (i = 0; i < TRAIN_DATA_NUM; i++)
{
sigv = sigmoidVec[i];
label = labels[i];
gsV += (label - sigv) * (features[i][index]);
}
// cout<<gsV<<endl;
// getchar();
gsV = gsV / TRAIN_DATA_NUM;
return gsV;
}
double lossCal(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], double(&w)[FEATURE_NUM])
{
double lossV = 0.0L;
int i;
double wxbVec[TRAIN_DATA_NUM];
double sigmoidVec[TRAIN_DATA_NUM];
wxbVecCalc(features, w, wxbVec);
sigmoidVecCalc(wxbVec, sigmoidVec);
for (i = 0; i < TRAIN_DATA_NUM; i++)
{
lossV -= labels[i] * log(sigmoidVec[i]);
lossV -= (1 - labels[i]) * log(1 - sigmoidVec[i]);
}
lossV /= TRAIN_DATA_NUM;
return lossV;
}
void train(double(&features)[TRAIN_DATA_NUM][FEATURE_NUM], int(&labels)[TRAIN_DATA_NUM], double(&wtSet)[FEATURE_NUM],
double stepSize0, double rho, double c, double epsilon)
{
double dfk0_norm = 0;
double dfk1_norm = 0;
double beta;
double stepSize;
double f1, f2;
int i, j;
double p[FEATURE_NUM];
double dfk0[FEATURE_NUM];
double dfk1[FEATURE_NUM];
double sigmoidVec[TRAIN_DATA_NUM];
double wTry[FEATURE_NUM];
double wxbVec[TRAIN_DATA_NUM];
//计算 Sigmoid(Aw)
wxbVecCalc(features, wtSet, wxbVec);
sigmoidVecCalc(wxbVec, sigmoidVec);
//cout << "debug 1" << sigmoidVec[10] << endl;
//计算 p 和 grad f的norm
gradientVec(features, labels, sigmoidVec, dfk0);
dfk0_norm = Dot(dfk0, dfk0);
for (i = 0; i < FEATURE_NUM; i++)
{
p[i] = dfk0[i];
}
//cout << "debug 2" << p[10] << "dfk0_norm" << dfk0_norm << endl;
#ifdef TEST
int iter_num = 0;
#endif
//开始进入算法的循环
//for (i = 0; i < maxIterTimes; i++) {
while (dfk0_norm > epsilon)
{
#ifdef TEST
iter_num = iter_num + 1;
#endif
// cout << "dfk0_norm:"<<dfk0_norm << endl;
stepSize = stepSize0;
//用回退方法计算步长
///*
double lossV = 0.0L;
for (i = 0; i < TRAIN_DATA_NUM; i++)
{
lossV -= labels[i] * log(sigmoidVec[i]);
lossV -= (1 - labels[i]) * log(1 - sigmoidVec[i]);
}
lossV /= TRAIN_DATA_NUM;
f2 = lossV - c*stepSize*Dot(dfk0, p);
// while (true)
for (;;)
{
// cout << "stepsize%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%:" << stepSize << endl;
// cout<<"stepsize:"<<stepSize<<endl;
//计算 w+alpha*p_k
// cout<<"stepsize:"<<stepSize<<endl;
for (j = 0; j < FEATURE_NUM; j++)
{
wTry[j] = wtSet[j] + stepSize * p[j];
}
// cout << "debug 3" << wTry[10] << endl;
// cout << "debug LL" << wtSet[10] << " " << stepSize << " " << p[10] << endl;
f1 = lossCal(features, labels, wTry);
// cout << "f1:"<<f1<<endl<<"f2:"<< f2 << endl;
//cout<<"debug L"<<lossCal()<<" "<<c<<" "<< stepSize<<" " <<dfk0[10]<<" "<<p[10]<<endl;
// cout << "debug 4" << "f1:" << f1 << "f2:" << f2 << endl;
if (f1 <= f2 || stepSize < 0.001)
{
break;
}
stepSize = rho*stepSize;
// cout << "stepSize" << stepSize << endl;
}
//*/
//stepSize = rho*stepSize0;
//先走一步共轭方向步
for (j = 0; j < FEATURE_NUM; j++)
{
wtSet[j] += stepSize * p[j];
}
//cout << "debug 5" << .wtSet[10] << endl;
//更新 beta 和 p
//算下一步梯度以及梯度模
//计算 Sigmoid(Aw)
wxbVecCalc(features, wtSet, wxbVec);
sigmoidVecCalc(wxbVec, sigmoidVec);
// cout<<"param.wtSet"<<param.wtSet[3]<<endl;
gradientVec(features, labels, sigmoidVec, dfk1);
dfk1_norm = Dot(dfk1, dfk1);
// cout << "debug 6:" << endl << sigmoidVec[10] << dfk1[10] << dfk1_norm << endl;
// cout<<"dfk1_norm"<<dfk1_norm<<endl;
//算 beta
beta = dfk1_norm / dfk0_norm;
//cout << beta << endl;
// cout << "debug 7:" << beta << endl;
// cout<<"p"<<p[10]<<endl;
//更新一下 p
for (j = 0; j < FEATURE_NUM; j++)
{
p[j] = dfk1[j] + beta*p[j];
}
// cout << "debug 8:" << p[10] << endl;
dfk0_norm = dfk1_norm;
for (i = 0; i < FEATURE_NUM; i++)
{
dfk0[i] = dfk1[i];
}
// cout << "dfk1_norm:"<<dfk1_norm << endl;
}
#ifdef TEST
cout << "train total iter num: " << iter_num << endl;
cout << "dfk0_norm: " << dfk0_norm << endl;
#endif
}
void loadAnswerData(string awFile, int(&awVec)[TEST_DATA_NUM])
{
// vector<int> awVec;
int k = 0;
ifstream infile(awFile.c_str());
if (!infile)
{
cout << "打开答案文件失败" << endl;
exit(0);
}
while (infile)
{
string line;
int aw;
getline(infile, line);
if (line.size() > 0 && k <= TEST_DATA_NUM)
{
stringstream sin(line);
sin >> aw;
awVec[k] = aw;
k = k + 1;
}
}
infile.close();
}
void loadTestData(string testFile, double(&features)[TEST_DATA_NUM][FEATURE_NUM])
{
FILE *fp;
fp = fopen(testFile.c_str(), "r");
if (fp == NULL)
{
printf("file error\n");
exit(1);
}
fseek(fp, 0L, SEEK_END);
int len = ftell(fp);//自动判断 fread 要读入的量
rewind(fp);//算完要读入的量后重置指针
const int maxS = len;//先读个 60 M
char* buf = new char[maxS];
len = fread(buf, 1, maxS, fp);//因为ftell回车算的是两个字符,所以这里要重置一下len
fclose(fp);
for (int i = 0; i <= len; i++)
{
//把换行符号替换掉
if (buf[i] == '\n')
{
buf[i] = ',';
}
}
const char * split = ",";
char *p = strtok(buf, split);
double temp;
int row_ind = 0;
int col_ind = 0;
int tick = 0;
//int ftf;
//double ftf;
while (p != NULL&&row_ind < TEST_DATA_NUM)
{
tick = tick + 1;
sscanf(p, "%lf", &temp);
features[row_ind][col_ind] = temp;
col_ind = col_ind + 1;
if (tick % (FEATURE_NUM) == 0)
{
row_ind = row_ind + 1;
col_ind = 0;
}
p = strtok(NULL, split);
}
}
void loadTestData2(string testFile, double(&features)[TEST_DATA_NUM][FEATURE_NUM])
{
#ifdef TEST
double ts111 = clock();
#endif
FILE *fp;
fp = fopen(testFile.c_str(), "rb");
int len = fread(buf_test, 1, MAXS_TEST, fp);
#ifdef TEST
double te111 = clock();
cout << "fread test data time is " << (double)(te111 - ts111) / CLOCKS_PER_SEC << "s" << endl;
double ts112 = clock();
#endif
buf_test[len] = '\0';
double position = 1.0;
int symbol = 0;
int point = 0;
int i, col, row = 0;
double line[FEATURE_NUM + 1] = { 0.0 };
i = 0;
for (char *p = buf_test; *p && p - buf_test < len; p++)
{
if (row >= TEST_DATA_NUM)
{
break;
}
if (*p == ',')
{
if (symbol == 1)
line[i] = -line[i];
i++;
point = 0;
position = 1.0;
symbol = 0;
line[i] = 0.0;
}
else if (*p == '\n')
{
if (symbol == 1)
line[i] = -line[i];
for (col = 0; col < FEATURE_NUM; col++)
{
features[row][col] = line[col];
}
row = row + 1;
i = 0;
point = 0;
position = 1.0;
symbol = 0;
line[i] = 0.0;
}
else if (*p == '.')
{
point = 1;
}
else if (*p == '-')
{
symbol = 1;
}
else
{
if (point == 0)
line[i] = line[i] * 10 + *p - '0';
else
{
position = position*0.1;
line[i] += position*(*p - '0');
}
}
}
fclose(fp);
#ifdef TEST
double te112 = clock();
cout << "split test data time is " << (double)(te112 - ts112) / CLOCKS_PER_SEC << "s" << endl;
#endif
}
int storePredict(int(&predict)[TEST_DATA_NUM], string predictOutFile)
{
string line;
int i;
ofstream fout(predictOutFile.c_str());
if (!fout.is_open())
{
cout << "打开预测结果文件失败" << endl;
}
for (i = 0; i < TEST_DATA_NUM; i++)
{
fout << predict[i] << endl;
}
fout.close();
return 0;
}
void predict(double(&features_test)[TEST_DATA_NUM][FEATURE_NUM], double(&wtSet)[FEATURE_NUM], string predictOutFile)
{
double sigVal;
int predictVal;
int predictVec[TEST_DATA_NUM];
double predictTrueThresh = 0.5;
double wxbVec[TEST_DATA_NUM];
double sVec[TEST_DATA_NUM];
wxbVecCalc_test(features_test, wtSet, wxbVec);
sigmoidVecCalc_test(wxbVec, sVec);
for (int j = 0; j < TEST_DATA_NUM; j++)
{
sigVal = sVec[j];
predictVal = sigVal >= predictTrueThresh ? 1 : 0;
predictVec[j] = predictVal;
}
storePredict(predictVec, predictOutFile);
}
2、
#include <map>
#include <list>
#include <set>
#include <stack>
#include <unordered_map>
#include <algorithm>
#include <iostream>
#include <stdio.h>
#include <chrono>
#include <vector>
#include <array>
#include <tuple>
#include <chrono>
#include <string.h>
#include <thread>
using namespace std;
typedef vector<vector<int>> Graph;
const int maxRecN = 280000;
const int maxCycleN = 3000000;
const int MAXS_R = maxCycleN * 80;
char a[MAXS_R];
char L3[MAXS_R];
char L4[MAXS_R];
char L5[MAXS_R];
char L6[MAXS_R];
char L7[MAXS_R];
int ID[maxRecN][2];
const int MAXS = maxRecN * 30;
char buf[MAXS];
int readTestData(string testDataPath, Graph &gTestData, vector<int> &idxMap, \
Graph &pathL2SmHead, \
Graph &pathL2SmTail)
{
int i = 0, commaN = 0;
FILE *fp;
fp = fopen(testDataPath.c_str(), "rb");
int len = fread(buf, 1, MAXS, fp);
buf[len] = '\0';
ID[0][0] = 0;
ID[0][1] = 0;
set<int> vertexSet;
unordered_map<int, int> idxMapRev;
for (char *p = buf; *p&&p - buf < len; p++)
{
if (*p == '\n')
{
++i;
ID[i][0] = 0;
ID[i][1] = 0;
commaN = 0;
}
else if (*p == ',')
{
vertexSet.insert(ID[i][commaN]);
commaN++;
}
else if (commaN < 2)
{
ID[i][commaN] = ID[i][commaN] * 10 + *p - '0';
}
}
int index = 0;
for (auto &it : vertexSet)
{
idxMap.emplace_back(it);
idxMapRev.insert(make_pair(it, index));
++index;
}
int from, to;
gTestData.resize(index, vector<int>());
pathL2SmHead.resize(index, vector<int>());
pathL2SmTail.resize(index, vector<int>());
for (int j = 0; j < i; ++j)
{
from = idxMapRev[ID[j][0]];
to = idxMapRev[ID[j][1]];
if (from != to)//no loop,some off line data have loop
{
gTestData[from].emplace_back(to);
if (from < to)
pathL2SmHead[from].emplace_back(to);
else
pathL2SmTail[from].emplace_back(to);
}
}
for (auto &it : gTestData)
sort(it.begin(), it.end());//sort for arranged output result
for (auto &it : pathL2SmHead)
sort(it.begin(), it.end());//sort for arranged output result
for (auto &it : pathL2SmTail)
sort(it.begin(), it.end());//sort for arranged output result
return i;
}
int getDigits(int v) {
if (v < 10) return 1;
if (v < 100) return 2;
if (v < 1000) return 3;
if (v < 1000000000000) { // 10^12
if (v < 100000000) { // 10^7
if (v < 1000000) { // 10^6
if (v < 10000) return 4;
return 5 + (v >= 100000); // 10^5
}
return 7 + (v >= 10000000); // 10^7
}
if (v < 10000000000) { // 10^10
return 9 + (v >= 1000000000); // 10^9
}
return 11 + (v >= 100000000000); // 10^11
}
return 12 + getDigits(v / 1000000000000); // 10^12
}
int intToAscii(int value, char* dst) {
static const char digits[] = {
"0001020304050607080910111213141516171819"
"2021222324252627282930313233343536373839"
"4041424344454647484950515253545556575859"
"6061626364656667686970717273747576777879"
"8081828384858687888990919293949596979899" };
const int length = getDigits(value);
int next = length - 1;
while (value >= 100) {
const int i = (value % 100) * 2;
value /= 100;
dst[next - 1] = digits[i];
dst[next] = digits[i + 1];
next -= 2;
}
// Handle last 1-2 digits
if (value < 10) {
dst[next] = '0' + int(value);
}
else {
int i = int(value) * 2;
dst[next - 1] = digits[i];
dst[next] = digits[i + 1];
}
return length;
}
void arr2Ascii3(array<int, 3> &arr, char* &p, vector<int> &idxMap)
{
int tmp;
int dig;
dig = intToAscii(idxMap[arr[0]], p);
p += dig;
for (int i = 1; i < 3; ++i)
{
*p++ = ',';
dig = intToAscii(idxMap[arr[i]], p);
p += dig;
}
*p++ = '\n';
}
void arr2Ascii4(array<int, 4> &arr, char* &p, vector<int> &idxMap)
{
int tmp;
int dig;
dig = intToAscii(idxMap[arr[0]], p);
p += dig;
for (int i = 1; i < 4; ++i)
{
*p++ = ',';
dig = intToAscii(idxMap[arr[i]], p);
p += dig;
}
*p++ = '\n';
}
void arr2Ascii5(array<int, 5> &arr, char* &p, vector<int> &idxMap)
{
int tmp;
int dig;
dig = intToAscii(idxMap[arr[0]], p);
p += dig;
for (int i = 1; i < 5; ++i)
{
*p++ = ',';
dig = intToAscii(idxMap[arr[i]], p);
p += dig;
}
*p++ = '\n';
}
void arr2Ascii6(array<int, 6> &arr, char* &p, vector<int> &idxMap)
{
int tmp;
int dig;
dig = intToAscii(idxMap[arr[0]], p);
p += dig;
for (int i = 1; i < 6; ++i)
{
*p++ = ',';
dig = intToAscii(idxMap[arr[i]], p);
p += dig;
}
*p++ = '\n';
}
void arr2Ascii7(array<int, 7> &arr, char* &p, vector<int> &idxMap)
{
int tmp;
int dig;
dig = intToAscii(idxMap[arr[0]], p);
p += dig;
for (int i = 1; i < 7; ++i)
{
*p++ = ',';
dig = intToAscii(idxMap[arr[i]], p);
p += dig;
}
*p++ = '\n';
}
void writeResultData(string resultDataPath, char* &p3, char* &p4, char* &p5, char* p6, char* p7, int cycleCount)
{
int sizeL3 = p3 - L3;
int sizeL4 = p4 - L4;
int sizeL5 = p5 - L5;
int sizeL6 = p6 - L6;
int sizeL7 = p7 - L7;
int totalSize = sizeL3 + sizeL4 + sizeL5 + sizeL6 + sizeL7;
memcpy(p3, L4, sizeL4);
p3 += sizeL4;
memcpy(p3, L5, sizeL5);
p3 += sizeL5;
memcpy(p3, L6, sizeL6);
p3 += sizeL6;
memcpy(p3, L7, sizeL7);
#ifdef TEST
auto t1 = chrono::steady_clock::now();
#endif
//string L3Path = "./L3.txt";
FILE *pL3 = fopen(resultDataPath.c_str(), "w+");
fprintf(pL3, "%d\n", cycleCount);
//fprintf(pL3, "%s", L3);
fwrite(L3, 1, totalSize, pL3);
fclose(pL3);
#ifdef TEST
auto t2 = chrono::steady_clock::now();
cout << "fwrite time: " << chrono::duration_cast<chrono::duration<double>>(t2 - t1).count() << " s" << endl;
#endif
}
void L3whole(Graph &gTestData, int &vtxNum, array<int, 2> arrL2, vector<vector<array<int, 2>>> &pathL3)
{
for (int h = 0; h < vtxNum; h++)
{
for (auto &h0 : gTestData[h])
{
for (auto &t0 : gTestData[h0])
{
if (h != t0)
{
arrL2[0] = h0;
arrL2[1] = t0;
pathL3[h].emplace_back(arrL2);
}
}
}
}
}
void L3head(Graph &pathL2SmHead, Graph &gTestData, int &vtxNum, array<int, 2> arrL2, vector<vector<array<int, 2>>> &pathL3SmHead)
{
for (int h = 0; h < vtxNum; h++)
{
for (auto &h0 : pathL2SmHead[h])
{
for (auto &t0 : gTestData[h0])
{
if (h < t0)
{
arrL2[0] = h0;
arrL2[1] = t0;
pathL3SmHead[h].emplace_back(arrL2);
}
}
}
}
}
void L3tail(Graph &gTestData, Graph &pathL2SmTail, int &vtxNum, array<int, 2> arrL2, vector<vector<array<int, 2>>> &pathL3SmTail)
{
for (int h = 0; h < vtxNum; h++)
{
for (auto &h0 : gTestData[h])
{
for (auto &t0 : pathL2SmTail[h0])
{
if (t0 < h)
{
arrL2[0] = h0;
arrL2[1] = t0;
pathL3SmTail[h].emplace_back(arrL2);
}
}
}
}
}
void L3tailIdx(Graph &gTestData, Graph &pathL2SmTail, int &vtxNum, array<int, 2> arrL2, vector<vector<array<int, 2>>> &pathL3SmTailIdx)
{
for (int h = 0; h < vtxNum; h++)
{
for (auto &h0 : gTestData[h])
{
for (auto &t0 : pathL2SmTail[h0])
{
if (t0 < h)
{
arrL2[0] = h;
arrL2[1] = h0;
pathL3SmTailIdx[t0].emplace_back(arrL2);
}
}
}
}
}
void L4head(vector<vector<array<int, 2>>> &pathL3SmHead, Graph &gTestData, int &vtxNum, array<int, 3> arrL3, vector<vector<array<int, 3>>> &pathL4SmHead)
{
for (int h = 0; h < vtxNum; ++h)
{
for (auto &it : pathL3SmHead[h])
{
//int h0 = it[0];
//int h1 = it[1];
for (auto &ite : gTestData[it[1]])
{
int t0 = ite;
if (h < t0&&it[0] != t0)//no self loop
{
arrL3[0] = it[0];
arrL3[1] = it[1];
arrL3[2] = t0;
pathL4SmHead[h].emplace_back(arrL3);
}
}
}
}
}
void L4tailIdx(Graph &gTestData, vector<vector<array<int, 2>>> &pathL3SmTail, int &vtxNum, array<int, 3> arrL3, vector<vector<array<int, 3>>> &pathL4SmTailIdx)
{
for (int h = 0; h < vtxNum; ++h)
{
for (auto &it : gTestData[h])
{
//int h0 = it;
for (auto &ite : pathL3SmTail[it])
{
//t0 = ite[0];
//t1 = ite[1];
if (h != ite[0] && ite[1] < h)//no self loop and last one is smallest
{
arrL3[0] = h;
arrL3[1] = it;
arrL3[2] = ite[0];
pathL4SmTailIdx[ite[1]].emplace_back(arrL3);
}
}
}
}
}
void L3cycle(Graph &pathL2SmHead, vector<vector<array<int, 2>>> &pathL3SmTailIdx, int &vtxNum, array<int, 3> arrL3, char* &p3, vector<int> &idxMap, int &cycleCount)
{
for (int h = 0; h < vtxNum; ++h)
{
for (auto &it : pathL2SmHead[h])
{
//h0 = it;
for (auto &ite : pathL3SmTailIdx[h])
{
//t0 = ite[0];
//t1 = ite[1];
if (it == ite[0])//no self loop
{
arrL3[0] = h;
arrL3[1] = it;
arrL3[2] = ite[1];
arr2Ascii3(arrL3, p3, idxMap);
cycleCount++;
}
}
}
}
}
void L5tailIdx(vector<vector<array<int, 2>>> &pathL3, vector<vector<array<int, 2>>> &pathL3SmTail, int &vtxNum, array<int, 4> arrL4, vector<vector<array<int, 4>>> &pathL5SmTailIdx)
{
for (int h = 0; h < vtxNum; ++h)
{
for (auto &it : pathL3[h])
{
int h0 = it[0];
int h1 = it[1];
for (auto &ite : pathL3SmTail[h1])
{
//t0 = ite[0];
//t1 = ite[1];
if (ite[1] < h&&ite[1] < h0&&h != ite[0] && h0 != ite[0])//small tail and no self loop
{
arrL4[0] = h;
arrL4[1] = h0;
arrL4[2] = h1;
arrL4[3] = ite[0];
pathL5SmTailIdx[ite[1]].emplace_back(arrL4);
}
}
}
}
}
void L4cycle(vector<vector<array<int, 2>>> &pathL3SmHead, vector<vector<array<int, 2>>> &pathL3SmTailIdx, int &vtxNum, array<int, 4> arrL4, char* &p4, vector<int> &idxMap, int &cycleCount)
{
for (int h = 0; h < vtxNum; ++h)
{
for (auto &it : pathL3SmHead[h])
{
int h0 = it[0];
int h1 = it[1];
for (auto &ite : pathL3SmTailIdx[h])
{
//t0 = ite[0];
//t1 = ite[1];
if (ite[0] == h1&&h0 != ite[1])//small tail and no self loop
{
arrL4[0] = h;
arrL4[1] = h0;
arrL4[2] = h1;
arrL4[3] = ite[1];
arr2Ascii4(arrL4, p4, idxMap);
cycleCount++;
}
}
}
}
}
void L5cycle(vector<vector<array<int, 3>>> &pathL4SmHead, vector<vector<array<int, 2>>> &pathL3SmTailIdx, int &vtxNum, array<int, 5> arrL5, char* &p5, vector<int> &idxMap, int &cycleCount)
{
for (int h = 0; h < vtxNum; ++h)
{
for (auto &it : pathL4SmHead[h])
{
int h0 = it[0];
int h1 = it[1];
int h2 = it[2];
for (auto &ite : pathL3SmTailIdx[h])
{
// t0 = ite[0];
// t1 = ite[1];
if (h2 == ite[0] && h0 != ite[1] && h1 != ite[1])//cycle and no self loop
{
arrL5[0] = h;
arrL5[1] = h0;
arrL5[2] = h1;
arrL5[3] = h2;
arrL5[4] = ite[1];
arr2Ascii5(arrL5, p5, idxMap);
cycleCount++;
}
}
}
}
}
void L6cycle(vector<vector<array<int, 3>>> &pathL4SmHead, vector<vector<array<int, 3>>> &pathL4SmTailIdx, int &vtxNum, array<int, 6> arrL6, char* &p6, vector<int> &idxMap, int &cycleCount)
{
for (int h = 0; h < vtxNum; ++h)
{
for (auto &it : pathL4SmHead[h])
{
int h0 = it[0];
int h1 = it[1];
int h2 = it[2];
for (auto &ite : pathL4SmTailIdx[h])
{
// t0 = ite[0];
// t1 = ite[1];
// t2 = ite[2];
if (h2 == ite[0] && h0 != ite[1] && h0 != ite[2] && h1 != ite[1] && h1 != ite[2])
{
arrL6[0] = h;
arrL6[1] = h0;
arrL6[2] = h1;
arrL6[3] = h2;
arrL6[4] = ite[1];
arrL6[5] = ite[2];
arr2Ascii6(arrL6, p6, idxMap);
cycleCount++;
}
}
}
}
}
void L7cycle(vector<vector<array<int, 3>>> &pathL4SmHead, vector<vector<array<int, 4>>> &pathL5SmTailIdx, int &vtxNum, array<int, 7> arrL7, char* &p7, vector<int> &idxMap, int &cycleCount)
{
for (int h = 0; h < vtxNum; ++h)
{
for (auto &it : pathL4SmHead[h])
{
int h0 = it[0];
int h1 = it[1];
int h2 = it[2];
for (auto &ite : pathL5SmTailIdx[h])
{
//t0 = ite[0];
// t1 = ite[1];
//t2 = ite[2];
// t3 = ite[3];
if (!(h2^ite[0])\
&&h0 != ite[1] && h1 != ite[1]\
&&h0 != ite[2] && h1 != ite[2]\
&&h0 != ite[3] && h1 != ite[3]\
)
{
arrL7[0] = h;
arrL7[1] = h0;
arrL7[2] = h1;
arrL7[3] = h2;
arrL7[4] = ite[1];
arrL7[5] = ite[2];
arrL7[6] = ite[3];
arr2Ascii7(arrL7, p7, idxMap);
cycleCount++;
}
}
}
}
}
int main()
{
#ifdef TEST
auto s1 = chrono::steady_clock::now();
#endif
string testDataPath = "/data/test_data.txt";
string resultDataPath = "/projects/student/result.txt";
#ifdef TEST
testDataPath = "./data/1004812/test_data.txt";
resultDataPath = "./data/myresult.txt";
#endif
char* p3 = L3;
char* p4 = L4;
char* p5 = L5;
char* p6 = L6;
char* p7 = L7;
int cycleCount = 0;
int cycleCount3 = 0;
int cycleCount4 = 0;
int cycleCount5 = 0;
int cycleCount6 = 0;
int cycleCount7 = 0;
Graph gTestData, pathL2SmHead, pathL2SmTail;
vector<int> idxMap;
int edgeNum = readTestData(testDataPath, gTestData, idxMap, pathL2SmHead, pathL2SmTail);
int vtxNum = idxMap.size();
vector<vector<array<int, 2>>> pathL3(vtxNum);
vector<vector<array<int, 2>>> pathL3SmHead(vtxNum);
vector<vector<array<int, 2>>> pathL3SmTail(vtxNum);
vector<vector<array<int, 2>>> pathL3SmTailIdx(vtxNum);
vector<vector<array<int, 3>>> pathL4(vtxNum);
vector<vector<array<int, 3>>> pathL4SmHead(vtxNum);
vector<vector<array<int, 3>>> pathL4SmTailIdx(vtxNum);
vector<vector<array<int, 4>>> pathL5(vtxNum);
vector<vector<array<int, 4>>> pathL5SmTailIdx(vtxNum);
array<int, 2> arrL2;
array<int, 3> arrL3;
array<int, 4> arrL4;
array<int, 5> arrL5;
array<int, 6> arrL6;
array<int, 7> arrL7;
thread threads[4];
//3whole=2whole+2whole
//L3whole(gTestData, vtxNum, arrL2, pathL3);
threads[0] = thread(L3whole, ref(gTestData), ref(vtxNum),(arrL2),ref(pathL3));
//threads[0].join();
//3head=2head+2whole
//L3head(pathL2SmHead, gTestData, vtxNum, arrL2, pathL3SmHead);
threads[1] = thread(L3head, ref(pathL2SmHead), ref(gTestData),ref(vtxNum),(arrL2),ref(pathL3SmHead));
//3tail=2whole+2tail
//L3tailIdx(gTestData, pathL2SmTail, vtxNum, arrL2, pathL3SmTailIdx);
threads[2] = thread(L3tailIdx, ref(gTestData), ref(pathL2SmTail),ref(vtxNum),(arrL2),ref(pathL3SmTailIdx));
//L3tail(gTestData, pathL2SmTail, vtxNum, arrL2, pathL3SmTail);
threads[3] = thread(L3tail, ref(gTestData), ref(pathL2SmTail),ref(vtxNum),(arrL2),ref(pathL3SmTail));
for (auto& it : threads) {
it.join();
}
thread threads2[5];
//4head=3head+2whole
//L4head(pathL3SmHead, gTestData, vtxNum, arrL3, pathL4SmHead);
threads2[0] = thread(L4head, ref(pathL3SmHead), ref(gTestData),ref(vtxNum),(arrL3),ref(pathL4SmHead));
//4tail=2whole+3tail
//L4tailIdx(gTestData, pathL3SmTail, vtxNum, arrL3, pathL4SmTailIdx);
threads2[1] = thread(L4tailIdx, ref(gTestData), ref(pathL3SmTail),ref(vtxNum),(arrL3),ref(pathL4SmTailIdx));
//3cycle=3head+2whole
//L3cycle(pathL2SmHead, pathL3SmTailIdx, vtxNum, arrL3, p3, idxMap, cycleCount);
threads2[2] = thread(L3cycle, ref(pathL2SmHead), ref(pathL3SmTailIdx),ref(vtxNum),(arrL3),ref(p3),ref(idxMap),ref(cycleCount3));
//5tail=3whole+3tail
//L5tailIdx(pathL3, pathL3SmTail, vtxNum, arrL4, pathL5SmTailIdx);
threads2[3] = thread(L5tailIdx, ref(pathL3), ref(pathL3SmTail),ref(vtxNum),(arrL4),ref(pathL5SmTailIdx));
//4cycle=3head+3tail
//L4cycle(pathL3SmHead, pathL3SmTailIdx, vtxNum, arrL4, p4, idxMap, cycleCount);
threads2[4] = thread(L4cycle, ref(pathL3SmHead), ref(pathL3SmTailIdx),ref(vtxNum),(arrL4),ref(p4),ref(idxMap),ref(cycleCount4));
for (auto& it : threads2) {
it.join();
}
thread threads3[3];
//5cycle=4head+3tail
//L5cycle(pathL4SmHead, pathL3SmTailIdx, vtxNum, arrL5, p5, idxMap, cycleCount);
threads3[0] = thread(L5cycle, ref(pathL4SmHead), ref(pathL3SmTailIdx),ref(vtxNum),(arrL5),ref(p5),ref(idxMap),ref(cycleCount5));
//6cycle=4head+4tail
//L6cycle(pathL4SmHead, pathL4SmTailIdx, vtxNum, arrL6, p6, idxMap, cycleCount);
threads3[1] = thread(L6cycle, ref(pathL4SmHead), ref(pathL4SmTailIdx),ref(vtxNum),(arrL6),ref(p6),ref(idxMap),ref(cycleCount6));
//7cycle=4head+5tail
//L7cycle(pathL4SmHead, pathL5SmTailIdx, vtxNum, arrL7, p7, idxMap, cycleCount);
threads3[2] = thread(L7cycle, ref(pathL4SmHead), ref(pathL5SmTailIdx),ref(vtxNum),(arrL7),ref(p7),ref(idxMap),ref(cycleCount7));
for (auto& it : threads3) {
it.join();
}
cycleCount = cycleCount3+cycleCount4+cycleCount5+cycleCount6+cycleCount7;
writeResultData(resultDataPath, p3, p4, p5, p6, p7, cycleCount);
#ifdef TEST
cout << "cycle count: " << cycleCount << endl;
#endif
return 0;
}
3、
//多线程实现
#include <iostream>
#include <algorithm>
#include <unordered_map>
#include <string.h>
#include <vector>
#include<chrono>
#include<mutex>
#include<thread>
#include <sys/mman.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
//#define TEST
#define is_digit(c) (c >= '0')
using namespace std;
const int MAX_REC = 2000000;
const int MAX_CYCLE_COUNT = 20000000;
const int MAX_ID_COUNT = 500000;//maximum:MAX_REC<<2
const int MAX_DEG = 300;//maximum:MAX_ID_COUNT-1
const int MAX_TEST_DATA_SIZE = MAX_REC * 33;//maximum: MAX_REC * 33
const int MAX_CYCLE3_CHAR_SIZE = MAX_CYCLE_COUNT*0.25 * 33;//maximum: MAX_CYCLE_COUNT * 33
const int MAX_CYCLE4_CHAR_SIZE = MAX_CYCLE_COUNT*0.25 * 44;//maximum: MAX_CYCLE_COUNT * 44
const int MAX_CYCLE5_CHAR_SIZE = MAX_CYCLE_COUNT*0.5 * 55;//maximum: MAX_CYCLE_COUNT * 55
const int MAX_CYCLE6_CHAR_SIZE = MAX_CYCLE_COUNT*0.8 * 66;//maximum: MAX_CYCLE_COUNT * 66
const int MAX_CYCLE7_CHAR_SIZE = MAX_CYCLE_COUNT*0.8 * 77;//maximum: MAX_CYCLE_COUNT * 77
const int MAX_CYCLE_CHAR_SIZE = MAX_CYCLE_COUNT * 77;//maximum: MAX_CYCLE_COUNT * 77
struct Origin_Id{
int id;
char* pos;
int len;
Origin_Id() :id(), pos(), len(){};
Origin_Id(int id_, char* pos_, int len_) :id(id_), pos(pos_), len(len_)
{};
};
string test_data_path = "/data/test_data.txt";
string result_data_path = "/projects/student/result.txt";
//char file_ptr[MAX_TEST_DATA_SIZE];
//int Rec[MAX_REC][3];
int(*Rec)[3];
//Origin_Id id_set[MAX_REC];
Origin_Id* id_set;
int rec_count = 0;
int id_count = 0;
void read_raw_data()
{
char* file_ptr = new char[MAX_TEST_DATA_SIZE];
id_set = new Origin_Id[MAX_REC * 2];
Rec = new int[MAX_REC][3];
FILE *fp;
fp = fopen(test_data_path.c_str(), "rb");
int data_sz = fread(file_ptr, 1, MAX_TEST_DATA_SIZE, fp);
char* buf_end = file_ptr + data_sz;
char* buf = file_ptr;
int comma_count = 0;
int cur_num = 0;
int id_len = 0;
char* id_pos = buf;
int comma_num = 0;
while (buf < buf_end)
{
switch (*buf)
{
case '\n'://reset counter
{
rec_count++;
Rec[rec_count][0] = 0;
Rec[rec_count][1] = 0;
Rec[rec_count][2] = 0;
comma_count = 0;
cur_num = 0;
id_len = 0;
id_pos = buf + 1;
break;
}
case ',':
{
Origin_Id id_tmp(cur_num, id_pos, id_len);
id_set[id_count++] = id_tmp;
comma_count++;
cur_num = 0;
id_len = 0;
id_pos = buf + 1;
break;
}
default:
{
if (is_digit(*buf))
{
cur_num = cur_num * 10 + *buf - '0';
Rec[rec_count][comma_count] = cur_num;
id_len++;
}
}
}
buf++;
}
}
bool id_cmp(Origin_Id a, Origin_Id b)
{
return a.id<b.id;
}
bool id_unq(Origin_Id a, Origin_Id b)
{
return a.id == b.id;
}
int id_num;
void id_map(unordered_map<int, int> &id_map_rev)
{
//unordered_map<int, int> id_map_rev;
sort(id_set, id_set + id_count, id_cmp);
id_num = unique(id_set, id_set + id_count, id_unq) - id_set;//now id_set is id_map
for (int i = 0; i < id_num; ++i)
{
id_map_rev.insert({ id_set[i].id, i });
}
}
struct Edge{
int id;
int money;
Edge() :id(), money(){};
Edge(int id_, int money_) :id(id_), money(money_){};
}__attribute__((packed));
//__attribute__((aligned(x))),
vector<vector<Edge>> Graph(MAX_ID_COUNT);
vector<vector<Edge>> ReGraph(MAX_ID_COUNT);
//Edge Graph[MAX_ID_COUNT][MAX_DEG];
//Edge ReGraph[MAX_ID_COUNT][MAX_DEG];
int deg_G[MAX_ID_COUNT];
int deg_RG[MAX_ID_COUNT];
bool id_cmp_edge(Edge a, Edge b)
{
return a.id<b.id;
}
void buid_graph()
{
unordered_map<int, int> id_map_rev;
id_map(id_map_rev);
int id_from;
int id_to;
int money;
for (int i = 0; i < rec_count; i++)
{
id_from = id_map_rev[Rec[i][0]];
id_to = id_map_rev[Rec[i][1]];
if (id_from == id_to) continue;
money = Rec[i][2];
Edge edge_tmp(id_to, money);
/* if (money > 715827882)
{
edge_tmp.money3 = 2147483647;
}
if (money > 429496729)
{
edge_tmp.money5 = 2147483647;
} */
Graph[id_from].emplace_back(edge_tmp);
edge_tmp.id = id_from;
ReGraph[id_to].emplace_back(edge_tmp);
deg_G[id_from]++;
deg_RG[id_to]++;
}
for (int i = 0; i<id_num; i++)
{
sort(Graph[i].begin(), Graph[i].begin() + deg_G[i], id_cmp_edge);
//sort(ReGraph[i].begin(), ReGraph[i].begin() + deg_RG[i], id_cmp_edge);
}
}
inline bool judgeMoneyRe(double money_3,long long int money5,int money)
{
return (money_3<=money)&&(money<=money5);
}
inline int mtp3(int momey)
{
int money3;
if(momey<=715827882)
{
money3 = momey*3;
}
else
{
money3 = 2147483647;
}
return money3;
}
inline int mtp5(int momey)
{
int money5;
if(momey<=429496729)
{
money5 = momey*5;
}
else
{
money5 = 2147483647;
}
return money5;
}
bool DFSInv(int &minCur, \
bool(&toMinCurL1)[MAX_ID_COUNT], \
bool(&toMinCurL2)[MAX_ID_COUNT], \
bool(&toMinCurL3)[MAX_ID_COUNT], \
int(&toMinCurL1Idx)[MAX_ID_COUNT]
)
{
int money[3];
double money_3[3];
long int money5[3];
bool flg = false;
for (int i = 0; i < deg_RG[minCur]; ++i)
{
int it = ReGraph[minCur][i].id;
int money0 = ReGraph[minCur][i].money;
money[0] = money0;
money_3[0] = money0/3.0;
money5[0] = mtp5(money0);
if (minCur < it)
{
toMinCurL1[it] = true;
toMinCurL1Idx[it] = i;
for (int j = 0; j < deg_RG[it]; ++j)
{
int ite = ReGraph[it][j].id;
int money1 = ReGraph[it][j].money;
if (minCur < ite&&judgeMoneyRe(money_3[0],money5[0],money1))
{
//if(!flg)
flg = true;
toMinCurL2[ite] = true;
money[1] = money1;
money_3[1] = money1/3.0;
money5[1] = mtp5(money1);
for (int k = 0; k < deg_RG[ite]; ++k)
{
int iter = ReGraph[ite][k].id;
int money2 = ReGraph[ite][k].money;
if (minCur < iter&&iter != it&&judgeMoneyRe(money_3[1],money5[1],money2))
{
toMinCurL3[iter] = true;
}
}
}
}
}
}
return flg;
}
bool judgeMoneyCycle(int(&stk_mny)[7], int num)
{
int money3;
int money5;
for (int i = 0; i < num - 1; ++i)
{
if(stk_mny[i]<=715827882)
{
money3 = stk_mny[i]*3;
}
else
{
money3 = 2147483647;
}
if(stk_mny[i + 1]<=429496729)
{
money5 = stk_mny[i+1]*5;
}
else
{
money5 = 2147483647;
}
if (money3<stk_mny[i + 1] || stk_mny[i]>money5)
return false;
}
if(stk_mny[num-1]<=715827882)
{
money3 = stk_mny[num-1]*3;
}
else
{
money3 = 2147483647;
}
if(stk_mny[0]<=429496729)
{
money5 = stk_mny[0]*5;
}
else
{
money5 = 2147483647;
}
if (money3<stk_mny[0] || stk_mny[num - 1]>money5)
return false;
return true;
}
void piece_memcpy(char *dest,char *src, const int n) {
switch(n) {
case 33 : *dest++ = *src++;
case 32 : *dest++ = *src++;
case 31 : *dest++ = *src++;
case 30 : *dest++ = *src++;
case 29 : *dest++ = *src++;
case 28 : *dest++ = *src++;
case 27 : *dest++ = *src++;
case 26 : *dest++ = *src++;
case 25 : *dest++ = *src++;
case 24 : *dest++ = *src++;
case 23 : *dest++ = *src++;
case 22 : *dest++ = *src++;
case 21 : *dest++ = *src++;
case 20 : *dest++ = *src++;
case 19 : *dest++ = *src++;
case 18 : *dest++ = *src++;
case 17 : *dest++ = *src++;
case 16 : *dest++ = *src++;
case 15 : *dest++ = *src++;
case 14 : *dest++ = *src++;
case 13 : *dest++ = *src++;
case 12 : *dest++ = *src++;
case 11 : *dest++ = *src++;
case 10 : *dest++ = *src++;
case 9 : *dest++ = *src++;
case 8 : *dest++ = *src++;
case 7 : *dest++ = *src++;
case 6 : *dest++ = *src++;
case 5 : *dest++ = *src++;
case 4 : *dest++ = *src++;
case 3 : *dest++ = *src++;
case 2 : *dest++ = *src++;
case 1 : *dest++ = *src++; break;
default:
exit(0);
}
}
inline int idToAscii(int value, char* dst) {
int len = id_set[value].len;
memcpy(dst, id_set[value].pos, len);
return len;
}
char* L[5];
char* pL[5];
void arr2Ascii(int(&arr)[7], int num)
{
char* &p = pL[num - 3];
int dig;
dig = idToAscii(arr[0], p);
p += dig;
for (int i = 1; i < num; ++i)
{
*p++ = ',';
dig = idToAscii(arr[i], p);
p += dig;
}
*p++ = '\n';
}
char* pT[4][5];
char* T[4][5];
int cycle_counts[4];
char* task_cycle_pos[20000][10];
void arr2AsciiThread(int(&arr)[7], int num, int &thread_id, int &task_id)
{
int cycle_type = num - 3;
char* &p = pT[thread_id][cycle_type];
int dig;
dig = idToAscii(arr[0], p);
p += dig;
for (int i = 1; i < num; ++i)
{
*p++ = ',';
dig = idToAscii(arr[i], p);
p += dig;
}
*p++ = '\n';
task_cycle_pos[task_id][cycle_type + 5] = p;
}
inline void pushMoney(int (&money)[6],double (&money_p2)[6],int (&money3)[6],int money_tmp, int layer)
{
money[layer] = money_tmp;
money_p2[layer] = 0.2*money_tmp;
money3[layer] = mtp3(money_tmp);
}
inline bool judgeMoney(double money_p2,int money3, int money)
{
return (money_p2<=money&&money<=money3);
}
void DFS(const int minCur, \
int(&stk)[7], int(&stk_mny)[7], bool(&onStack)[MAX_ID_COUNT], bool(&toMinCurL1)[MAX_ID_COUNT], \
bool(&toMinCurL2)[MAX_ID_COUNT], \
bool(&toMinCurL3)[MAX_ID_COUNT], \
int(&toMinCurL1Idx)[MAX_ID_COUNT], int &thread_id, int &task_id,\
int (&money)[6],double (&money_p2)[6],int (&money3)[6])
{
stk[0] = minCur;
int s1 = minCur;
onStack[s1] = true;
int money_tmp;
double moneyO_3;
int moneyO5;
for (int i2 = 0; i2 < deg_G[s1]; i2++)
{
auto s2 = Graph[s1][i2].id;
if (s2 <= minCur) continue;
stk[1] = s2;
onStack[s2] = true;
money_tmp = Graph[s1][i2].money;
pushMoney(money,money_p2,money3, money_tmp, 0);
moneyO_3 = money_tmp/3.0;
moneyO5 = mtp5(money_tmp);
for (int i3 = 0; i3 < deg_G[s2]; i3++)
{
auto s3 = Graph[s2][i3].id;
money_tmp = Graph[s2][i3].money;
if (s3 <= minCur || onStack[s3] || !judgeMoney(money_p2[0],money3[0],money_tmp)) continue;
pushMoney(money,money_p2,money3, money_tmp, 1);
//pushMoney(stk_mny, s2, i3, 1);
stk[2] = s3;
onStack[s3] = true;
if (toMinCurL1[s3])
{
//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s3], 2);
money_tmp = ReGraph[minCur][toMinCurL1Idx[s3]].money;
//if (judgeMoneyCycle(stk_mny, 3))
if (judgeMoney(money_p2[1],money3[1],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp))
{
arr2AsciiThread(stk, 3, thread_id, task_id);
cycle_counts[thread_id]++;
}
}
for (int i4 = 0; i4 < deg_G[s3]; i4++)
{
auto s4 = Graph[s3][i4].id;
money_tmp = Graph[s3][i4].money;
if (onStack[s4] || s4 <= minCur||!judgeMoney(money_p2[1],money3[1],money_tmp)) continue;
stk[3] = s4;
onStack[s4] = true;
//pushMoney(stk_mny, s3, i4, 2);
pushMoney(money,money_p2,money3, money_tmp, 2);
if (toMinCurL1[s4])
{
money_tmp = ReGraph[minCur][toMinCurL1Idx[s4]].money;
//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s4], 3);
//if (judgeMoneyCycle(stk_mny, 4))
if (judgeMoney(money_p2[2],money3[2],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp))
{
arr2AsciiThread(stk, 4, thread_id, task_id);
cycle_counts[thread_id]++;
}
}
for (int i5 = 0; i5 < deg_G[s4]; i5++)
{
auto s5 = Graph[s4][i5].id;
money_tmp = Graph[s4][i5].money;
if (minCur >= s5 || onStack[s5] || !judgeMoney(money_p2[2],money3[2],money_tmp)||!toMinCurL3[s5] && !toMinCurL2[s5] && !toMinCurL1[s5]) continue;
stk[4] = s5;
pushMoney(money,money_p2,money3, money_tmp, 3);
if (toMinCurL1[s5])
{
//stk[4] = s5;
//pushMoney(stk_mny, s4, i5, 3);
//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s5], 4);
money_tmp = ReGraph[minCur][toMinCurL1Idx[s5]].money;
if (judgeMoney(money_p2[3],money3[3],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp))
{
arr2AsciiThread(stk, 5, thread_id, task_id);
cycle_counts[thread_id]++;
}
}
if (toMinCurL2[s5])
{
//stk[4] = s5;
//pushMoney(stk_mny, s4, i5, 3);
//pushMoney(money,money_p2,money3, money_tmp, 3);
for (int i6 = 0; i6 < deg_G[s5]; i6++)
{
auto s6 = Graph[s5][i6].id;
money_tmp = Graph[s5][i6].money;
if (minCur<s6&&!onStack[s6] && toMinCurL1[s6]&&judgeMoney(money_p2[3],money3[3],money_tmp))
{
stk[5] = s6;
//money_tmp = Graph[s5][i6].money;
//pushMoney(stk_mny, s5, i6, 4);
//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s6], 5);
pushMoney(money,money_p2,money3, money_tmp, 4);
money_tmp = ReGraph[minCur][toMinCurL1Idx[s6]].money;
if (judgeMoney(money_p2[4],money3[4],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp))
{
arr2AsciiThread(stk, 6, thread_id, task_id);
cycle_counts[thread_id]++;
}
}
}
}
if (toMinCurL3[s5])
{
//stk[4] = s5;
onStack[s5] = true;
//pushMoney(stk_mny, s4, i5, 3);
for (int i6 = 0; i6 < deg_G[s5]; i6++)
{
auto s6 = Graph[s5][i6].id;
money_tmp = Graph[s5][i6].money;
if (toMinCurL2[s6] && !onStack[s6] && minCur <= s6&&judgeMoney(money_p2[3],money3[3],money_tmp))
{
stk[5] = s6;
//pushMoney(stk_mny, s5, i6, 4);
pushMoney(money,money_p2,money3, money_tmp, 4);
for (int i7 = 0; i7 < deg_G[s6]; i7++)
{
auto s7 = Graph[s6][i7].id;
money_tmp = Graph[s6][i7].money;
if (minCur<s7&&toMinCurL1[s7] && !onStack[s7]&&judgeMoney(money_p2[4],money3[4],money_tmp))
{
stk[6] = s7;
pushMoney(money,money_p2,money3, money_tmp, 5);
//pushMoney(stk_mny, s6, i7, 5);
//pushMoneyLast(stk_mny, minCur, toMinCurL1Idx[s7], 6);
money_tmp = ReGraph[minCur][toMinCurL1Idx[s7]].money;
//if (judgeMoneyCycle(stk_mny, 7))
if (judgeMoney(money_p2[5],money3[5],money_tmp)&&judgeMoney(moneyO_3,moneyO5,money_tmp))
{
arr2AsciiThread(stk, 7, thread_id, task_id);
cycle_counts[thread_id]++;
}
}
}
}
}
onStack[s5] = false;
}
}
onStack[s4] = false;
}
onStack[s3] = false;
}
onStack[s2] = false;
}
onStack[minCur] = false;
}
void do_task(int cur_task_id, int(&stk)[7], int(&stk_mny)[7], bool(&onStack)[MAX_ID_COUNT], \
bool(&toMinCurL1)[MAX_ID_COUNT], bool(&toMinCurL2)[MAX_ID_COUNT], bool(&toMinCurL3)[MAX_ID_COUNT], \
int(&toMinCurL1Idx)[MAX_ID_COUNT], vector<int> &task_set, int &thread_id, int &task_id,\
int (&money)[6],double (&money_p2)[6],int (&money3)[6])
{
int from = task_set[cur_task_id];
int to = task_set[cur_task_id + 1];
for (int minCur = from; minCur < to; ++minCur)
{
memset(toMinCurL1, false, id_num*sizeof(bool));
memset(toMinCurL2, false, id_num*sizeof(bool));
memset(toMinCurL3, false, id_num*sizeof(bool));
bool flg = DFSInv(minCur, toMinCurL1, toMinCurL2, toMinCurL3, toMinCurL1Idx);
if (flg == true)
{
DFS(minCur, stk, stk_mny, onStack, toMinCurL1, toMinCurL2, toMinCurL3, toMinCurL1Idx, thread_id, task_id,money,money_p2,money3);
}
}
}
vector<mutex> mtx(20000);
void dfs_thread(int thread_id, vector<int> &task_set)
{
bool onStack[MAX_ID_COUNT] = { false };
int stk[7];
int stk_mny[7];
bool toMinCurL1[MAX_ID_COUNT] = { false };
bool toMinCurL2[MAX_ID_COUNT] = { false };
bool toMinCurL3[MAX_ID_COUNT] = { false };
int toMinCurL1Idx[MAX_ID_COUNT];
int cur_task_id = 0;
char* ptr_from_cur[5] = { pT[thread_id][0], pT[thread_id][1], pT[thread_id][2],pT[thread_id][3],pT[thread_id][4] };
int money[6];
double money_p2[6];
int money3[6];
while(cur_task_id<task_set.size()-1)
{
if(mtx[cur_task_id].try_lock())
{
memcpy(task_cycle_pos[cur_task_id], ptr_from_cur, sizeof(char*)* 5);
memcpy(task_cycle_pos[cur_task_id]+5, ptr_from_cur, sizeof(char*)* 5);
do_task(cur_task_id, stk, stk_mny, onStack, toMinCurL1, toMinCurL2, toMinCurL3, toMinCurL1Idx, task_set,thread_id,cur_task_id,\
money,money_p2,money3);
memcpy(ptr_from_cur, task_cycle_pos[cur_task_id]+5, sizeof(char*)* 5);
}
cur_task_id++;
}
}
int task_num;
void run_dfs()
{
vector<int> task_set;
task_num = id_num / 100+1;
for (int i = 0; i < task_num; i++)
{
task_set.push_back(i * 100);
}
task_set.push_back(id_num+1);
thread threads[4];
for (int i = 0; i < 4; i++)
{
threads[i] = thread(dfs_thread, i, ref(task_set));
}
for (auto& it : threads) {
it.join();
}
}
int getDigits(int v) {
if (v < 10) return 1;
if (v < 100) return 2;
if (v < 1000) return 3;
if (v < 1000000000000) { // 10^12
if (v < 100000000) { // 10^7
if (v < 1000000) { // 10^6
if (v < 10000) return 4;
return 5 + (v >= 100000); // 10^5
}
return 7 + (v >= 10000000); // 10^7
}
if (v < 10000000000) { // 10^10
return 9 + (v >= 1000000000); // 10^9c
}
return 11 + (v >= 100000000000); // 10^11
}
return 12 + getDigits(v / 1000000000000); // 10^12
}
int intToAscii(int value, char* dst) {
static const char digits[] = {
"0001020304050607080910111213141516171819"
"2021222324252627282930313233343536373839"
"4041424344454647484950515253545556575859"
"6061626364656667686970717273747576777879"
"8081828384858687888990919293949596979899" };
const int length = getDigits(value);
int next = length - 1;
while (value >= 100) {
const int i = (value % 100) * 2;
value /= 100;
dst[next - 1] = digits[i];
dst[next] = digits[i + 1];
next -= 2;
}
// Handle last 1-2 digits
if (value < 10) {
dst[next] = '0' + int(value);
}
else {
int i = int(value) * 2;
dst[next - 1] = digits[i];
dst[next] = digits[i + 1];
}
return length;
}
char *LAll;
int cycleCount = 0;
void write_th(int th_id,int (&task_sz)[5],void *(&dst_ptr))
{
int from = task_sz[th_id];
int size = task_sz[th_id+1]-from;
// cout<<endl<<th_id<<" "<<from<<" "<<size<<endl;
// cout<<(char*)dst_ptr+size-(char*)dst_ptr<<endl;
memcpy((char*)dst_ptr+from,LAll+from,size);
}
void write_result()
{
char* pAll = LAll;
int dig = intToAscii(cycleCount, pAll);
int sz;
pAll += dig;
*pAll++ = '\n';
for (int j = 0; j < 5; j++)
{
for (int i = 0; i < task_num; i++)
{
auto pos_ptr = task_cycle_pos[i];
sz = pos_ptr[j + 5] - pos_ptr[j];
memcpy(pAll, (pos_ptr[j]), sz);
pAll += sz;
}
}
int total = pAll - LAll;
int task_sz = total/4;
int task_set[5] = {0,task_sz,2*task_sz,3*task_sz,total+1};
int fd = open(result_data_path.c_str(),O_RDWR|O_CREAT,0666);
//cout<<111<<endl;
truncate(result_data_path.c_str(),total);
//cout<<total<<endl;
//cout<<task_sz<<endl;
// for(int i=0;i<5;i++)
// {
// cout<<task_set[i]<<endl;
// }
void *dst_ptr=mmap(NULL,total,PROT_WRITE|PROT_READ,MAP_SHARED,fd,0);
//memcpy(dst_ptr,LAll,total);
thread threads[4];
for (int i = 0; i < 4; i++)
{
threads[i] = thread(write_th, i, ref(task_set),ref(dst_ptr));
}
for (auto& it : threads) {
it.join();
}
//exit(0);
//munmap(dst_ptr,total);
}
int main()
{
#ifdef TEST
test_data_path = "./data/std/test_data.txt";
result_data_path = "./data/myresult.txt";
auto s1 = chrono::steady_clock::now();
#endif
// int a = 3;
// double b = 3.3;
// bool flg = (a>b);
// cout << flg <<endl;
read_raw_data();
#ifdef TEST
auto s2 = chrono::steady_clock::now();
cout << "read data time: " << chrono::duration_cast<chrono::duration<double>>(s2 - s1).count() << " s" << endl;
#endif
//id_map();
buid_graph();
#ifdef TEST
auto s3 = chrono::steady_clock::now();
cout << "build graph time: " << chrono::duration_cast<chrono::duration<double>>(s3 - s2).count() << " s" << endl;
#endif
// L[0] = new char[MAX_CYCLE3_CHAR_SIZE / 10];
// L[1] = new char[MAX_CYCLE4_CHAR_SIZE / 100];
// L[2] = new char[MAX_CYCLE5_CHAR_SIZE / 100];
// L[3] = new char[MAX_CYCLE6_CHAR_SIZE / 100];
// L[4] = new char[MAX_CYCLE7_CHAR_SIZE / 100];
for (int i = 0; i<5; ++i)
{
pL[i] = L[i];
}
LAll = new char[MAX_CYCLE_CHAR_SIZE];
for (int i = 0; i < 4; i++)
{
T[i][0] = new char[MAX_CYCLE3_CHAR_SIZE];
pT[i][0] = T[i][0];
T[i][1] = new char[MAX_CYCLE4_CHAR_SIZE];
pT[i][1] = T[i][1];
T[i][2] = new char[MAX_CYCLE5_CHAR_SIZE];
pT[i][2] = T[i][2];
T[i][3] = new char[MAX_CYCLE6_CHAR_SIZE];
pT[i][3] = T[i][3];
T[i][4] = new char[MAX_CYCLE7_CHAR_SIZE];
pT[i][4] = T[i][4];
}
run_dfs();
#ifdef TEST
auto s4 = chrono::steady_clock::now();
cout << "dfs time: " << chrono::duration_cast<chrono::duration<double>>(s4 - s3).count() << " s" << endl;
#endif
for (int i = 0; i < 4; i++)
cycleCount += cycle_counts[i];
write_result();
#ifdef TEST
auto s5 = chrono::steady_clock::now();
cout << "write result time: " << chrono::duration_cast<chrono::duration<double>>(s5 - s4).count() << " s" << endl;
#endif
#ifdef TEST
cout << id_num << endl;
cout << cycleCount << endl;
#endif
exit(0);
}