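This article is mainly based on: http://man7.org/linux/man-pages/man2/perf_event_open.2.html
Scenario: a project needed a way to measure the IPC (instructions per cycle) of a given process in real time.
I first came across the idea in Section 3.1 of the paper "CPI²: CPU performance isolation for shared compute clusters", which says: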
CPI sampling. CPI data is derived from hardware counters, and is defined as the value of the CPU_CLK_UNHALTED.REF counter divided by the INSTRUCTIONS_RETIRED counter. These are counted simultaneously, and collected on a per-cgroup basis. (Per-CPU counting wouldn't work because several unrelated tasks frequently timeshare a single CPU (hardware context). Per-thread counting would require too much memory: running thousands of threads on a machine is not uncommon (figure 1b).)
The CPI data is sampled periodically by a system daemon using the perf_event tool [13] in counting mode (rather than sampling mode) to keep overhead to a minimum. We gather CPI data for a 10 second period once a minute; we picked this fraction to give other measurement tools time to use the counters. The counters are saved/restored when a context switch changes to a thread from a different cgroup, which costs a couple of microseconds. Total CPU overhead is less than 0.1% and incurs no visible latency impact to our users.
[13] ERANIAN, S. perfmon2: the hardware-based performance monitoring interface for Linux. http://perfmon2.sourceforge.net/, 2008.
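I tried installing perfmon2 to measure and collect IPC in real time, but I'm too dumb >_< and never worked out how to actually use it... All I could do was run the sample programs under example and perf_example.
Later, Google led me to http://man7.org/linux/man-pages/man2/perf_event_open.2.html, which explains the perf_event_open() function in detail.
In fact, studying perfmon2 and its code shows that perfmon2 also ends up calling perf_event_open() (although it supports more events; run the programs in the example directory to see which events are supported).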
int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu, int group_fd, unsigned long flags);
DESCRIPTION
Given a list of parameters, perf_event_open() returns a file descriptor, for use in subsequent system calls (read(2), mmap(2), prctl(2), fcntl(2), etc.). A call to perf_event_open() creates a file descriptor that allows measuring performance information. Each file descriptor corresponds to one event that is measured; these can be grouped together to measure multiple events simultaneously. Events can be enabled and disabled in two ways: via ioctl(2) and via prctl(2). When an event is disabled it does not count or generate overflows but does continue to exist and maintain its count value. Events come in two flavors: counting and sampled. A counting event is one that is used for counting the aggregate number of events that occur. In general, counting event results are gathered with a read(2) call. A sampling event periodically writes measurements to a buffer that can then be accessed via mmap(2).
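In other words: each call to perf_event_open() creates one file descriptor through which performance information can be measured. Each file descriptor corresponds to one measured event, and several events can be grouped so that they are measured simultaneously.
Events can be enabled or disabled in two ways: via ioctl(2) or via prctl(2). A disabled event neither counts nor generates overflows, but it still exists and keeps its count value.
Events come in two flavors: counting and sampling. A counting event accumulates the total number of occurrences, and its result is normally collected with a read(2) call. A sampling event periodically writes measurements to a buffer that is accessed via mmap(2).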
Here is an example program:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>
#include <asm/unistd.h>

#define PID_NUM 123   /* default PID of the process to monitor */

/* glibc provides no wrapper for perf_event_open(), so go through syscall(2). */
static long
perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
                int cpu, int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, hw_event, pid, cpu,
                   group_fd, flags);
}

int
main(int argc, char **argv)
{
    struct perf_event_attr pe;
    long long cycles = 0, instructions = 0;
    int fd_cycles, fd_instr;
    /* Use the PID given on the command line, otherwise PID_NUM. */
    pid_t pid = (argc > 1) ? (pid_t)atoi(argv[1]) : PID_NUM;

    memset(&pe, 0, sizeof(struct perf_event_attr));
    pe.type = PERF_TYPE_HARDWARE;
    pe.size = sizeof(struct perf_event_attr);
    pe.config = PERF_COUNT_HW_CPU_CYCLES;
    pe.disabled = 1;
    pe.exclude_kernel = 1;
    pe.exclude_hv = 1;

    /* Group leader: cycles of the target process on any CPU (cpu = -1). */
    fd_cycles = perf_event_open(&pe, pid, -1, -1, 0);
    if (fd_cycles == -1) {
        perror("perf_event_open (cycles)");
        exit(EXIT_FAILURE);
    }

    /* Group member: instructions, counted over the same interval as the leader. */
    pe.config = PERF_COUNT_HW_INSTRUCTIONS;
    fd_instr = perf_event_open(&pe, pid, -1, fd_cycles, 0);
    if (fd_instr == -1) {
        perror("perf_event_open (instructions)");
        close(fd_cycles);
        exit(EXIT_FAILURE);
    }

    /* PERF_IOC_FLAG_GROUP applies each ioctl to both events in the group. */
    ioctl(fd_cycles, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
    ioctl(fd_cycles, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);
    sleep(1);   /* let the target run for one second with both counters enabled */
    ioctl(fd_cycles, PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);

    if (read(fd_cycles, &cycles, sizeof(cycles)) != (ssize_t)sizeof(cycles) ||
        read(fd_instr, &instructions, sizeof(instructions)) != (ssize_t)sizeof(instructions)) {
        perror("read");
        exit(EXIT_FAILURE);
    }

    if (cycles > 0)
        printf("Used %lld instructions, %lld cycles, ipc=%f\n",
               instructions, cycles, (double)instructions / (double)cycles);
    else
        printf("Used %lld instructions, 0 cycles, ipc undefined\n", instructions);

    close(fd_instr);
    close(fd_cycles);
    return 0;
}