Measuring function execution time on Linux, starting from clock_gettime and gettimeofday

Introduction

OpenCV provides the following two functions for reading the current tick count and the tick frequency (ticks per second).

static long long getTickCount(void)
{
#if defined _WIN32 || defined WINCE
    LARGE_INTEGER counter;
    QueryPerformanceCounter( &counter );
    return (long long)counter.QuadPart;
#elif defined __linux || defined __linux__
    struct timespec tp;
    clock_gettime(CLOCK_MONOTONIC, &tp);
    return (long long)tp.tv_sec*1000000000 + tp.tv_nsec;
#elif defined __MACH__ && defined __APPLE__
    return (long long)mach_absolute_time();
#else
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec*1000000 + tv.tv_usec;
#endif
}

CV_EXPORTS_W double getTickFrequency(void)
{
#if defined _WIN32 || defined WINCE
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);
    return (double)freq.QuadPart;
#elif defined __linux || defined __linux__
    return 1e9; 
#elif defined __MACH__ && defined __APPLE__
    static double freq = 0; 
    if( freq == 0 )
    {    
        mach_timebase_info_data_t sTimebaseInfo;
        mach_timebase_info(&sTimebaseInfo);
        freq = sTimebaseInfo.denom*1e9/sTimebaseInfo.numer;
    }    
    return freq;
#else
    return 1e6; 
#endif
}

We can call getTickCount twice, once before and once after an operation, to measure how long the operation takes.

long long t_start = getTickCount();
...
long long t_end = getTickCount();

/* ticks divided by ticks-per-second gives seconds */
double time_elapsed = (double)(t_end - t_start) / getTickFrequency();
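
The same pattern can be tried outside OpenCV. The following is a minimal sketch that reads CLOCK_MONOTONIC directly, as the Linux branch of getTickCount does. The helper names monotonic_ns and elapsed_seconds are made up for this example and are not part of OpenCV.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* assumption: glibc; exposes clock_gettime */
#endif
#include <time.h>

/* Current CLOCK_MONOTONIC reading in nanoseconds, or -1 on error. */
long long monotonic_ns(void)
{
    struct timespec tp;
    if (clock_gettime(CLOCK_MONOTONIC, &tp) != 0)
        return -1;
    return (long long)tp.tv_sec * 1000000000LL + tp.tv_nsec;
}

/* Convert the difference of two nanosecond readings to seconds. */
double elapsed_seconds(long long t_start, long long t_end)
{
    return (double)(t_end - t_start) / 1e9;
}
```

Calling monotonic_ns() before and after the code under test and passing both readings to elapsed_seconds gives the elapsed time in seconds.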

In the functions above we can see that the time difference is computed mainly with clock_gettime and gettimeofday, and that clock_gettime is preferred over gettimeofday. Why is that?

clock_gettime & gettimeofday

Comparing precision

gettimeofday returns the number of seconds and microseconds since the Epoch.

The functions gettimeofday() and settimeofday() can get and set the time as well as a timezone.  The tv argument is a struct timeval (as specified in <sys/time.h>):

           struct timeval {
               time_t      tv_sec;     /* seconds */
               suseconds_t tv_usec;    /* microseconds */
           };
and gives the number of seconds and microseconds since the Epoch (see time(2)).
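
For comparison, here is a small sketch of reading the clock through gettimeofday and folding the two fields into one microsecond count, the way the fallback branch of getTickCount does. The function name wall_clock_us is an assumption made for this example.

```c
#include <stddef.h>
#include <sys/time.h>

/* Wall-clock time in microseconds since the Epoch, or -1 on error. */
long long wall_clock_us(void)
{
    struct timeval tv;
    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    /* tv_usec is always in the range 0..999999 */
    return (long long)tv.tv_sec * 1000000LL + tv.tv_usec;
}
```

Because this is wall-clock time, the difference between two readings can be wrong, or even negative, if the system time is changed in between.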

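clock_gettime reports time in seconds and nanoseconds, so its resolution can be as fine as one nanosecond. The resolution actually delivered depends on the clock being read.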

The functions clock_gettime() and clock_settime() retrieve and set the time of the specified clock clk_id.

       The res and tp arguments are timespec structures, as specified in <time.h>:

           struct timespec {
               time_t   tv_sec;        /* seconds */
               long     tv_nsec;       /* nanoseconds */
           };
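
The res argument mentioned in the excerpt belongs to clock_getres, which reports the resolution of a clock. A minimal sketch of querying it follows; the wrapper name clock_resolution_ns is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* assumption: glibc; exposes clock_getres */
#endif
#include <time.h>

/* Resolution of the given clock in nanoseconds, or -1 on error. */
long long clock_resolution_ns(clockid_t clk_id)
{
    struct timespec res;
    if (clock_getres(clk_id, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```

For example, clock_resolution_ns(CLOCK_MONOTONIC) returns the resolution the kernel advertises for the monotonic clock on the current machine.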

Comparing monotonicity

clock_gettime can return monotonic, continuous time (with the CLOCK_MONOTONIC clock ID), whereas gettimeofday cannot.

This clock is not affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the clock), but is affected by the incremental adjustments performed by adjtime(3) and NTP.
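
Since a monotonic clock never steps backward, the difference between two of its readings is a valid interval. The sketch below subtracts two timespec values and handles the borrow from the nanosecond field. timespec_diff is a helper written for this example, not a libc function, and it assumes that end is not earlier than start.

```c
#include <time.h>

/* Return end - start, normalized so that 0 <= tv_nsec < 1000000000.
 * Assumes end >= start, which holds for two CLOCK_MONOTONIC readings
 * taken in that order. */
struct timespec timespec_diff(struct timespec start, struct timespec end)
{
    struct timespec d;
    d.tv_sec = end.tv_sec - start.tv_sec;
    d.tv_nsec = end.tv_nsec - start.tv_nsec;
    if (d.tv_nsec < 0) {
        /* borrow one second from the seconds field */
        d.tv_sec -= 1;
        d.tv_nsec += 1000000000L;
    }
    return d;
}
```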

How the different clock IDs affect what clock_gettime returns:

Clock ID                  Description
CLOCK_REALTIME            Represents wall-clock time. Can be both stepped and slewed by time adjustment code (e.g., NTP, PTP).
CLOCK_REALTIME_COARSE     A lower-resolution version of CLOCK_REALTIME.
CLOCK_REALTIME_HR         A higher-resolution version of CLOCK_REALTIME. Only available with the real-time kernel.
CLOCK_MONOTONIC           Represents the interval from an arbitrary time. Can be slewed but not stepped by time adjustment code. As such, it can only move forward, not backward.
CLOCK_MONOTONIC_COARSE    A lower-resolution version of CLOCK_MONOTONIC.
CLOCK_MONOTONIC_RAW       A version of CLOCK_MONOTONIC that can neither be slewed nor stepped by time adjustment code.
CLOCK_BOOTTIME            A version of CLOCK_MONOTONIC that additionally reflects time spent in suspend mode. Only available in newer (2.6.39+) kernels.
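
To see how these clocks differ in practice, the following sketch reads each clock ID that the system headers define and prints its current value. The function names print_clock and print_known_clocks are made up for this example. CLOCK_REALTIME_HR is left out because, as far as I know, the standard glibc headers do not define it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: glibc; exposes the Linux-specific clock IDs */
#endif
#include <stdio.h>
#include <time.h>

/* Read one clock and print it. Returns 0 on success, -1 if unavailable. */
int print_clock(const char *name, clockid_t clk_id)
{
    struct timespec tp;
    if (clock_gettime(clk_id, &tp) != 0) {
        printf("%-22s unavailable\n", name);
        return -1;
    }
    printf("%-22s %lld.%09ld s\n", name, (long long)tp.tv_sec, (long)tp.tv_nsec);
    return 0;
}

/* Print every clock from the table that the headers define.
 * Returns the number of clocks that could be read. */
int print_known_clocks(void)
{
    int ok = 0;
    ok += print_clock("CLOCK_REALTIME", CLOCK_REALTIME) == 0;
    ok += print_clock("CLOCK_MONOTONIC", CLOCK_MONOTONIC) == 0;
#ifdef CLOCK_REALTIME_COARSE
    ok += print_clock("CLOCK_REALTIME_COARSE", CLOCK_REALTIME_COARSE) == 0;
#endif
#ifdef CLOCK_MONOTONIC_COARSE
    ok += print_clock("CLOCK_MONOTONIC_COARSE", CLOCK_MONOTONIC_COARSE) == 0;
#endif
#ifdef CLOCK_MONOTONIC_RAW
    ok += print_clock("CLOCK_MONOTONIC_RAW", CLOCK_MONOTONIC_RAW) == 0;
#endif
#ifdef CLOCK_BOOTTIME
    ok += print_clock("CLOCK_BOOTTIME", CLOCK_BOOTTIME) == 0;
#endif
    return ok;
}
```

CLOCK_REALTIME prints a value counted from the Epoch, while the monotonic clocks print a much smaller value counted from their own arbitrary starting point.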

If all we need is to read the system time, what we have covered so far is enough. But how does the system keep its time tick, and how precise is that tick?

How Linux keeps the tick count, and the precision of the tick count

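The tick count is produced by a clock source, and its precision is determined by the frequency of that clock source. So which clock sources are available?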

  • The TSC is a register counter that is also driven from a crystal oscillator – the same oscillator that is used to generate the clock pulses that drive the CPU(s). As such it runs at the frequency of the CPU, so for instance a 2GHz clock will tick twice per nanosecond.

  • The HPET (High Precision Event Timer) was introduced by Microsoft and Intel around 2005. Its precision is approximately 100 ns, so it is less accurate than the TSC, which can provide sub-nanosecond accuracy. It is also much more expensive to query the HPET than the TSC.

  • The acpi_pm clock source has the advantage that its frequency doesn’t change based on power-management code, but since it runs at 3.58MHz (one tick every 279 ns), it is not nearly as accurate as the preceding timers.

  • jiffies signifies that the clock source is actually the same timer used for scheduling, and as such its resolution is typically quite poor. (The default scheduling interval in most Linux variants is either 1 ms or 10 ms).
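
The tick periods quoted in the list above follow directly from the frequencies: one tick lasts 1e9 / frequency nanoseconds. A tiny sketch of that arithmetic, with a function name (tick_period_ns) invented for this example:

```c
/* Length of one tick in nanoseconds for a counter running at freq_hz. */
double tick_period_ns(double freq_hz)
{
    return 1e9 / freq_hz;
}
```

With the figures from the list, tick_period_ns(2e9) gives 0.5 ns for a 2 GHz TSC, tick_period_ns(3.58e6) gives about 279 ns for acpi_pm, and tick_period_ns(1000.0) gives 1,000,000 ns (1 ms) for a jiffies clock with a 1 ms scheduling interval.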

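In general we use the TSC (Time Stamp Counter) clock source, because it has high precision and low overhead. Modern CPUs have also improved the TSC and fixed many of the problems it used to have.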

How to check which TSC features the CPU supports:

$ cat /proc/cpuinfo | grep -i tsc
flags : ... tsc  rdtscp constant_tsc nonstop_tsc ...

The flags have the following meanings:

Flag            Meaning
tsc             The system has a TSC clock.
rdtscp          The RDTSCP instruction is available.
constant_tsc    The TSC ticks at a constant rate, independent of CPU frequency changes.
nonstop_tsc     The TSC is not affected by power management code.
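
Note that grep -i tsc matches substrings, so constant_tsc alone would also produce a hit. If a program needs to test for one specific flag, it has to match whole words. Below is a rough sketch of such a check on a flags string that has already been read from /proc/cpuinfo. The function has_cpu_flag is hypothetical and written only for this example.

```c
#include <string.h>

/* Return 1 if `flag` appears as a whole whitespace-separated word
 * in `flags`, otherwise 0. */
int has_cpu_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        /* the match must start and end on a word boundary */
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += 1; /* keep searching after a partial match */
    }
    return 0;
}
```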

How to check the clock sources the system supports

List the available clock sources:

$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm

Show the clock source currently in use:

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
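
The same sysfs files can be read from a program. Here is a minimal sketch, assuming the sysfs paths shown above exist on the target machine. The helper read_first_line is a name chosen for this example.

```c
#include <stdio.h>
#include <string.h>

/* Read the first line of `path` into `buf`, without the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
int read_first_line(const char *path, char *buf, size_t size)
{
    FILE *fp;

    if (size == 0)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)size, fp) == NULL) {
        fclose(fp);
        buf[0] = '\0';
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0'; /* strip the newline, if any */
    return 0;
}
```

Passing "/sys/devices/system/clocksource/clocksource0/current_clocksource" as the path fills the buffer with the name of the active clock source, for example tsc. On systems without that file (some containers, for instance) the function simply returns -1.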

The cost of reading the clock

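We use clock_gettime to measure elapsed time. Besides the precision limit set by the clock source's tick count, the clock_gettime call itself takes some time.
On my virtual machine (i5 3.2 GHz, kvm_clock), one clock_gettime call costs roughly 400 ns, which is about the same as one system call.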
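
A figure like this can be estimated by calling clock_gettime many times in a loop and dividing the total time by the number of calls. The following is a rough sketch of that idea, not the exact method used for the number above. The function name clock_gettime_cost_ns and the iteration count are assumptions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* assumption: glibc; exposes clock_gettime */
#endif
#include <time.h>

/* Average cost in nanoseconds of one clock_gettime(CLOCK_MONOTONIC) call,
 * measured over `iterations` calls. Returns -1.0 on error. */
double clock_gettime_cost_ns(long iterations)
{
    struct timespec start, end, tmp;
    double total_ns;
    long i;

    if (iterations <= 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (i = 0; i < iterations; i++)
        clock_gettime(CLOCK_MONOTONIC, &tmp); /* the call being measured */
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    total_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
             + (double)(end.tv_nsec - start.tv_nsec);
    return total_ns / (double)iterations;
}
```

For example, clock_gettime_cost_ns(1000000) averages over one million calls. The result depends heavily on the machine and on the clock source in use, so it should be read as an order of magnitude rather than an exact figure.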

References

http://btorpey.github.io/blog/2014/02/18/clock-sources-in-linux/
