Rewriting the KCF Code by Hand

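This post walks through an implementation of the KCF algorithm and the C++ and OpenCV techniques it relies on for image processing: using glob, string handling, file reading and writing, and timing. It is aimed at readers interested in computer vision and algorithm optimization.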


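I have recently been rewriting a KCF implementation downloaded from GitHub, both to understand the algorithm thoroughly and to practice programming. Its author implemented a KCF class in C++ and calls OpenCV for the image-processing operations. While reading that code and writing my own version I ran into the issues below, which I record one by one as a summary.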

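1) glob. In OpenCV 3, glob lives in the cv namespace and can be called directly as cv::glob(). The official documentation only gives the signature below, with no example. In practice, the function stores the names of all files that match pattern in result; setting recursive to true also searches subdirectories.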

void cv::glob ( String pattern, std::vector< String > & result, bool recursive = false )

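2) c_str()
c_str() returns a const pointer to a null-terminated C string whose contents match those of the string object. It exists for compatibility with C: C has no string type, so a string object has to be handed to C-style interfaces through its member function c_str(). Prototype:

const char *c_str() const;

Note: the returned pointer refers to the string's internal buffer and stays valid only until the string is modified or destroyed. To keep the characters or change them, copy them out first, for example with strcpy() into a buffer that is large enough.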

char c[20];
string s="1234";
strcpy(c,s.c_str());

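Another example:

c_str() returns the contents of the string as a const char*. When a function expects a const char* argument, pass the result of c_str():

string s = "Hello World!";
printf("%s", s.c_str());    // prints "Hello World!"

For reference, the three placements of const on a char pointer:

const char *p1;       // pointer to const char: the characters p1 points to cannot be modified
char const *p2;       // same as above
char * const p3 = c;  // const pointer to char: p3 itself (the address) cannot be changed, so it must be initialized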
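The lifetime rule for c_str() can be sketched as follows. This is a minimal illustration, not code from the KCF repository; the helper names copyCString and copyBounded are made up for this example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Copy the characters behind c_str() into a buffer we own.
// The pointer returned by c_str() is only valid until the string is
// modified or destroyed, so anything that must outlive it needs a copy.
std::vector<char> copyCString(const std::string& s) {
    std::vector<char> buf(s.size() + 1);              // +1 for the terminating '\0'
    std::memcpy(buf.data(), s.c_str(), s.size() + 1);
    return buf;
}

// Bounded copy into a fixed-size array: writes at most n bytes and
// always terminates the result, truncating the text if necessary.
void copyBounded(char* dst, std::size_t n, const std::string& s) {
    assert(dst != nullptr);
    if (n == 0) return;
    std::strncpy(dst, s.c_str(), n - 1);
    dst[n - 1] = '\0';
}
```

Unlike a bare strcpy into char c[20], copyBounded cannot overrun the destination when the string turns out to be longer than expected.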

3) GetTickCount()
DWORD GetTickCount(void);
Definition
For Release configurations, this function returns the number of milliseconds since the device booted, excluding any time that the system was suspended. GetTickCount starts at 0 on boot and then counts up from there.
For Debug configurations, 180 seconds is subtracted from the number of milliseconds since the device booted. This allows code that uses GetTickCount to be easily tested for correct overflow handling.

Return Values
The number of milliseconds indicates success.

Header: Winbase.h.
Link Library: Coredll.lib.
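The "correct overflow handling" mentioned above matters because a 32-bit millisecond counter wraps back to 0 after roughly 49.7 days. A minimal sketch of wrap-safe arithmetic follows; it uses std::uint32_t as a portable stand-in for DWORD, and the names Tick32, elapsedMs, and timedOut are assumptions for illustration, not Windows APIs.

```cpp
#include <cstdint>

// Stand-in for DWORD: a 32-bit unsigned tick counter in milliseconds.
using Tick32 = std::uint32_t;

// Elapsed milliseconds between two counter readings.
// Unsigned subtraction is modulo 2^32, so the result stays correct
// even if the counter wrapped once between `start` and `now`.
Tick32 elapsedMs(Tick32 start, Tick32 now) {
    return now - start;
}

// Timeout check in the wrap-safe form. Comparing `now` against
// `start + timeoutMs` directly would misbehave near the wrap point.
bool timedOut(Tick32 start, Tick32 now, Tick32 timeoutMs) {
    return elapsedMs(start, now) >= timeoutMs;
}
```

On Windows the two readings would come from GetTickCount(); the subtraction itself is unchanged.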

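4) DWORD
DWORD needs no declaration of your own in C++, but you must include the header Windows.h. In detail:

  1. DWORD stands for Double Word. A word is 2 bytes long, so a DWORD is 4 bytes; at 8 bits per byte that is 32 bits.
  2. The Windows headers define it as: typedef unsigned long DWORD;
  3. In 32-bit Windows code DWORD is often used to hold addresses (or pointers). On 64-bit Windows a pointer no longer fits in a DWORD; DWORD_PTR is the pointer-sized type.
  4. To use it, add the include #include <windows.h>.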

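5) Escape characters
In source code, a backslash \ followed by a character forms an escape sequence: \0 is the null character, \r is carriage return, \n is newline, and so on.
"\\" is the escape sequence for the backslash itself. It is how you write a literal backslash, most often as a path separator. For example, to store the file path D:\badboy\html\images.jpg in a string, write string str = "D:\\badboy\\html\\images.jpg";
Without the doubling, each backslash would be read as the start of an escape sequence (\b, for instance, is backspace) instead of a literal path separator.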
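A small sketch of the difference, using the path from the text. The function names are made up for this example; raw string literals require C++11 or later.

```cpp
#include <string>

// Path written with escaped backslashes: each "\\" in the source
// becomes a single '\' character in the stored string.
std::string escapedPath() {
    return "D:\\badboy\\html\\images.jpg";
}

// The same path as a C++11 raw string literal: nothing between
// R"( and )" is treated as an escape sequence.
std::string rawPath() {
    return R"(D:\badboy\html\images.jpg)";
}

// What happens without escaping: "\b" is the backspace character (0x08),
// so the stored text is no longer the intended path.
std::string unescapedB() {
    return "D:\badboy";
}
```

escapedPath() and rawPath() hold identical text, while unescapedB() is one character shorter than it looks because \b collapsed into a single control character.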

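6) The difference between <string> and <cstring>
<string> is a C++ standard library header. It declares the container-like class std::string (which is in fact just a typedef of basic_string<char>) for string manipulation.
<cstring> is the C++ standard library version of the C standard library header <string.h>. It declares types and functions for C-style strings (terminated by NUL, that is '\0'), such as strcmp, strchr, and strstr.
string.h comes from the C library and predates C++ standardization (1998). During standardization, to stay compatible with existing code, the committee redefined all such headers and brought them into the standard library, adding a "c" prefix to the name and dropping the .h suffix, so string.h became cstring. The implementation is the same as, or compatible with, the old one. In effect the committee stamped it and declared it part of the standard library.
In general, an older C++ header with a ".h" extension, such as iostream.h, has a counterpart without ".h" in the standardized library. Apart from many improvements in the newer version, the other difference is that its contents are placed in the std namespace.
<string> is the header for std::string as defined by C++. It belongs to the C++ standard library and provides many member functions for string manipulation.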

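7) Measuring elapsed time
getTickCount(): returns the number of ticks counted since a fixed reference event (for example, system start) up to now. The name says it: get Tick Count(s).
getTickFrequency(): returns the number of ticks per second (not the CPU clock frequency). The name says it: get Tick Frequency.

The rest follows directly:
ticks elapsed / ticks per second = time (s)
1000 * ticks elapsed / ticks per second = time (ms)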
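The arithmetic above can be sketched without OpenCV. This example substitutes std::chrono::steady_clock as the tick source so that it stands alone; with OpenCV, cv::getTickCount() would supply the two readings and cv::getTickFrequency() the ticks per second. The names ticksToMs and measureMs are hypothetical.

```cpp
#include <chrono>
#include <cstdint>

// ticks / (ticks per second) = seconds; multiply by 1000 for milliseconds.
double ticksToMs(std::int64_t ticks, double ticksPerSecond) {
    return 1000.0 * static_cast<double>(ticks) / ticksPerSecond;
}

// Time a callable. steady_clock::period is the length of one tick in
// seconds as a ratio num/den, so ticks per second is den/num.
template <typename F>
double measureMs(F&& work) {
    using Clock = std::chrono::steady_clock;
    const double ticksPerSecond =
        static_cast<double>(Clock::period::den) / Clock::period::num;
    const auto start = Clock::now().time_since_epoch().count();
    work();
    const auto stop = Clock::now().time_since_epoch().count();
    return ticksToMs(static_cast<std::int64_t>(stop - start), ticksPerSecond);
}
```

Wrapping one tracker update in measureMs would give the per-frame cost in milliseconds.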

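8) The original author reads ground_truth.txt with C++ I/O: declare an fstream object fp, open the file through its member function, fp.open("path"), and call fp.close() when done. This is a good place to compare the file I/O functions of C and C++.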

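| Item | C | C++ |
|:---|:---|:---|
| Header | <stdio.h> | <fstream> |
| Open a file | FILE* fp = fopen("filename", "w/r/a"); | fstream gt; gt.open("filename") |
| Close a file | fclose(fp) | gt.close() |
| Read a line of text | fgets(char* _Buffer, int _MaxCount, FILE* _Stream) | string line; getline(gt, line); |
| Read a character | getchar() / fgetc / getc | member function get() |
| Write a character | putchar() / fputc / putc | member function put() |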
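The C++ column of the comparison can be put together as a short line reader. This is a sketch rather than the original author's code; readLines is a made-up name.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Read every line of a text file. Returns an empty vector if the file
// cannot be opened. The stream closes itself when `in` goes out of
// scope, so no explicit close() call is needed here.
std::vector<std::string> readLines(const std::string& path) {
    std::vector<std::string> lines;
    std::ifstream in(path);                 // same effect as in.open(path)
    if (!in.is_open()) return lines;
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}
```

Calling readLines on ground_truth.txt would return one string per bounding box, ready for the parsing step described next.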

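Each line read in has the form (left,top,width,height). The ',' separators have to be removed and the text converted to numbers. Here are a C approach and a C++ approach:
a) C approach:
Declare character buffers ctmp and integers ntmp, use sscanf to split the line into the buffers, then convert each buffer with atoi.

	char ctmp1[16], ctmp2[16], ctmp3[16], ctmp4[16];
	int ntmp1, ntmp2, ntmp3, ntmp4;
	// cRead holds one line of the file; the field widths keep sscanf inside the buffers
	sscanf(cRead, "%15[^,], %15[^,], %15[^,], %15s", ctmp1, ctmp2, ctmp3, ctmp4);

	ntmp1 = atoi(ctmp1);
	ntmp2 = atoi(ctmp2);
	ntmp3 = atoi(ctmp3);
	ntmp4 = atoi(ctmp4);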
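When the fields are known to be integers, the intermediate character buffers can be skipped. The sketch below is an alternative to the snippet above, not the original author's code; parseBoxC is a hypothetical name. It also checks the return value of sscanf, which reports how many fields were converted.

```cpp
#include <cstdio>

// Parse "left,top,width,height" straight into integers.
// A space in the format skips optional whitespace before each comma,
// and %d skips whitespace before each number.
// sscanf returns the number of fields converted, so anything other
// than 4 means the line was malformed.
bool parseBoxC(const char* line, int box[4]) {
    return std::sscanf(line, "%d ,%d ,%d ,%d",
                       &box[0], &box[1], &box[2], &box[3]) == 4;
}
```

A line with fractional values such as "198.5,214,34,81" is rejected by this version, because %d stops at the decimal point and the following comma no longer matches.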
	

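b) C++ approach:
First replace every ',' with ' ' using std::replace from <algorithm> (not the string member function .replace()), then declare a stringstream, assign the string to it, and read the integer variables from the stringstream. A few words on stringstream:
<sstream> defines three classes, istringstream, ostringstream, and stringstream, for stream input, stream output, and both input and output respectively. This post focuses on stringstream.

<sstream> is mainly used for data type conversion. Because it works on string objects instead of character arrays (as snprintf does), it avoids the risk of buffer overflow. And because the types of the arguments and the target objects are deduced automatically, there is no risk of a mismatched format specifier. In short, compared with the C library conversions, <sstream> is safer, more automatic, and more direct.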

	
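	string line;    // one line of ground_truth.txt, filled in by getline(gt, line)
	std::replace(line.begin(), line.end(), ',', ' ');    // needs <algorithm>
	stringstream ss;    // needs <sstream>
	ss.str(line);
	ss >> tmp1 >> tmp2 >> tmp3 >> tmp4;    // tmp1..tmp4 are numeric variables declared elsewhere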
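Wrapped into a function, the same steps might look like the sketch below. The name parseBoxCpp and the choice of double for the fields are assumptions made for this example; the stream's fail state is used to detect malformed lines.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Parse "left,top,width,height" by turning commas into spaces and
// letting the stream extraction operators do the conversion.
// `line` is taken by value so the caller's string is left untouched.
bool parseBoxCpp(std::string line, double box[4]) {
    std::replace(line.begin(), line.end(), ',', ' ');
    std::istringstream ss(line);
    ss >> box[0] >> box[1] >> box[2] >> box[3];
    return !ss.fail();    // false if any field was missing or not a number
}
```

Only the element type of box would need to change to read integers instead, which is the type-deduction advantage described above.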

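9) fopen errors
When fopen fails to open a file, the perror function can report why. Prototype:
void perror(const char *message);
If message is not NULL and points to a non-empty string, perror prints that string followed by a colon and a space, and then prints a message describing the current value of errno. An explanation of the error codes: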
https://blog.csdn.net/wssjn1994/article/details/99539888
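A minimal sketch of the pattern follows. The function name openOrExplain is made up for this example, and the code assumes a platform where fopen sets errno on failure, as POSIX systems do; the C standard itself does not require that.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Try to open a file for reading. On failure, report the reason with
// perror (written to stderr) and also return it as text via strerror.
// Returns an empty string on success.
std::string openOrExplain(const char* path) {
    std::FILE* fp = std::fopen(path, "r");
    if (fp == nullptr) {
        const int err = errno;    // capture before any later call overwrites it
        std::perror(path);        // prints "<path>: <reason>"
        return std::strerror(err);
    }
    std::fclose(fp);
    return "";
}
```

Passing the path as the perror message makes the log line identify which file failed, not just the reason.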

## Tracking with Kernelized Correlation Filters

Code author : Tomas Vojir

________________

This is a C++ reimplementation of algorithm presented in "High-Speed Tracking with Kernelized Correlation Filters" paper. For more info and implementation in other languages visit the [autor's webpage!](http://home.isr.uc.pt/~henriques/circulant/).

It is extended by a scale estimation (use several *7* different scales steps) and by a RGB (channels) and Color Names [2] features. Data for Color Names features were obtained from [SAMF tracker](https://github.com/ihpdep/samf).

It is free for research use. If you find it useful or use it in your research, please acknowledge my git repository and cite the original paper [1].

The code depends on OpenCV 2.4+ library and is build via cmake toolchain.

_________________

Quick start guide

for linux: open terminal in the directory with the code

$ mkdir build; cd build; cmake .. ; make

This code compiles into binary **kcf_vot**

./kcf_vot
- using VOT 2014 methodology (http://www.votchallenge.net/)
- INPUT : expecting two files, images.txt (list of sequence images with absolute path) and region.txt with initial bounding box in the first frame in format "top_left_x, top_left_y, width, height" or four corner points listed clockwise starting from bottom left corner.
- OUTPUT : output.txt containing the bounding boxes in the format "top_left_x, top_left_y, width, height"

./kcf_trax
- using VOT 2014+ trax protocol (http://www.votchallenge.net/)
- require [trax](https://github.com/votchallenge/trax) library to be compiled with opencv support and installed. See trax instruction for compiling and installing.

___________

Performance

| | **VOT2016 - baseline EAO** | **VOT2016 - unsupervised EAO** | [**TV77**](http://cmp.felk.cvut.cz/~vojirtom/dataset/index.html) Avg. Recall |
|:---------------|:--------------:|:------------------:|:----------------:|
| kcf |0.1530 | 0.3859 | 51% |
| skcf |0.1661 | 0.4155 | 56% |
| skcf-cn |0.178 | 0.4136 | 58% |
| kcf-master |**0.1994** | **0.4376** | **63%** |

__________

References

[1] João F. Henriques, Rui Caseiro, Pedro Martins, Jorge Batista, “High-Speed Tracking with Kernelized Correlation Filters“, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015

[2] J. van de Weijer, C. Schmid, J. J. Verbeek, and D. Larlus. "Learning color names for real-world applications." TIP, 18(7):1512–1524, 2009.

_____________________________________

Copyright (c) 2014, Tomáš Vojíř

Permission to use, copy, modify, and distribute this software for research purposes is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

__________________

Additional Library

NOTE: The following files are part of Piotr's Toolbox, and were modified for usage with c++

src/piotr_fhog/gradientMex.cpp
src/piotr_fhog/sse.hpp
src/piotr_fhog/wrappers.hpp

You are encouraged to get the [full version of this library here.](http://vision.ucsd.edu/~pdollar/toolbox/doc/index.html)

______________________________________________________________________________

Copyright (c) 2012, Piotr Dollar
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

The views and conclusions contained in the software and documentation are those of the authors and should not be interpreted as representing official policies, either expressed or implied, of the FreeBSD Project.