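Hamming distance is a measure of the difference between two strings of equal length. It equals the minimum number of substitutions needed to turn one string into the other, where each substitution changes exactly one character.
To compute it, compare the two strings position by position and count the positions at which the characters differ. For example, consider the Hamming distance between the strings "101101" and "100111":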
1 0 1 1 0 1
1 0 0 1 1 1
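The characters differ at the third and fifth positions (indices 2 and 4 when counting from 0), so the Hamming distance is 2.
Hamming distance is widely used in error detection and error-correcting codes, for example to measure how similar two binary words are. It can also be seen as a restricted form of edit distance for equal-length strings in which substitution is the only permitted operation; general edit distance also allows insertions and deletions, so the two values need not agree.
Example 1: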
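The link to error detection and correction can be made concrete with a small sketch. It relies on the standard coding-theory fact that a code whose codewords are pairwise at least d apart can detect up to d - 1 substitution errors and correct up to (d - 1) / 2 of them (integer division). The helper names `minimumDistance`, `detectableErrors`, and `correctableErrors` are hypothetical and chosen only for this illustration; the sketch assumes all codewords have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Number of positions at which two strings differ.
// Assumes a.size() == b.size(); only the common prefix is compared otherwise.
std::size_t hammingDistance(const std::string& a, const std::string& b) {
    std::size_t distance = 0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        distance += (a[i] != b[i]);
    }
    return distance;
}

// Smallest pairwise Hamming distance over all pairs of codewords.
// Hypothetical helper; returns 0 when fewer than two codewords are given.
std::size_t minimumDistance(const std::vector<std::string>& code) {
    if (code.size() < 2) {
        return 0;
    }
    std::size_t best = std::numeric_limits<std::size_t>::max();
    for (std::size_t i = 0; i < code.size(); ++i) {
        for (std::size_t j = i + 1; j < code.size(); ++j) {
            best = std::min(best, hammingDistance(code[i], code[j]));
        }
    }
    return best;
}

// A code with minimum distance d can detect up to d - 1 errors.
std::size_t detectableErrors(std::size_t d) { return d == 0 ? 0 : d - 1; }

// A code with minimum distance d can correct up to (d - 1) / 2 errors.
std::size_t correctableErrors(std::size_t d) { return d == 0 ? 0 : (d - 1) / 2; }
```

For instance, the 3-bit repetition code {"000", "111"} has minimum distance 3, so under these formulas it can detect two flipped bits or correct one, while the even-parity code {"000", "011", "101", "110"} has minimum distance 2 and can only detect a single flipped bit.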
/**
* @file
* @brief Returns the [Hamming
* distance](https://en.wikipedia.org/wiki/Hamming_distance) between two
* integers
*
* @details
* To find hamming distance between two integers, we take their xor, which will
* have a set bit iff those bits differ in the two numbers.
* Hence, we return the number of such set bits.
*
* @author [Ravishankar Joshi](https://github.com/ravibitsgoa)
*/
#include <cassert>   /// for assert
#include <cstdint>   /// for uint64_t
#include <iostream>  /// for io operations
#include <string>    /// for std::string
/**
* @namespace bit_manipulation
* @brief Bit Manipulation algorithms
*/
namespace bit_manipulation {
/**
* @namespace hamming_distance
* @brief Functions for [Hamming
* distance](https://en.wikipedia.org/wiki/Hamming_distance) implementation
*/
namespace hamming_distance {
/**
* This function returns the number of set bits in the given number.
* @param value the number of which we want to count the number of set bits.
* @returns the number of set bits in the given number.
*/
uint64_t bitCount(uint64_t value) {
    uint64_t count = 0;
    while (value) {       // until all bits are zero
        if (value & 1) {  // check lower bit
            count++;
        }
        value >>= 1;  // shift bits, removing lower bit
    }
    return count;
}
/**
* This function returns the hamming distance between two integers.
* @param a the first number
* @param b the second number
* @returns the number of bits differing between the two integers.
*/
uint64_t hamming_distance(uint64_t a, uint64_t b) { return bitCount(a ^ b); }
/**
* This function returns the hamming distance between two strings.
* @param a the first string
* @param b the second string
* @returns the number of characters differing between the two strings.
*/
uint64_t hamming_distance(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    size_t n = a.size();
    uint64_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count += (b[i] != a[i]);
    }
    return count;
}
} // namespace hamming_distance
} // namespace bit_manipulation
/**
* @brief Function to test the hamming distance.
* @returns void
*/
static void test() {
    assert(bit_manipulation::hamming_distance::hamming_distance(11, 2) == 2);
    assert(bit_manipulation::hamming_distance::hamming_distance(2, 0) == 1);
    assert(bit_manipulation::hamming_distance::hamming_distance(11, 0) == 3);

    assert(bit_manipulation::hamming_distance::hamming_distance("1101",
                                                                "1111") == 1);
    assert(bit_manipulation::hamming_distance::hamming_distance("1111",
                                                                "1111") == 0);
    assert(bit_manipulation::hamming_distance::hamming_distance("0000",
                                                                "1111") == 4);

    assert(bit_manipulation::hamming_distance::hamming_distance("alpha",
                                                                "alphb") == 1);
    assert(bit_manipulation::hamming_distance::hamming_distance("abcd",
                                                                "abcd") == 0);
    assert(bit_manipulation::hamming_distance::hamming_distance("dcba",
                                                                "abcd") == 4);
}
/**
* @brief Main function
* @returns 0 on exit
*/
int main() {
    test();           // execute the tests
    uint64_t a = 11;  // 1011 in binary
    uint64_t b = 2;   // 0010 in binary

    std::cout << "Hamming distance between " << a << " and " << b << " is "
              << bit_manipulation::hamming_distance::hamming_distance(a, b)
              << std::endl;
    return 0;
}
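The `bitCount` loop in Example 1 inspects every bit up to the highest set one. As a possible alternative, the sketch below uses the well-known trick usually attributed to Brian Kernighan: `value & (value - 1)` clears the lowest set bit, so the loop runs once per set bit rather than once per bit position. The names `bitCountKernighan` and `hammingDistanceFast` are hypothetical and not part of the original listing.

```cpp
#include <cstdint>

// Counts set bits by clearing the lowest set bit on each iteration.
// The loop body executes exactly once per set bit in `value`.
std::uint64_t bitCountKernighan(std::uint64_t value) {
    std::uint64_t count = 0;
    while (value != 0) {
        value &= value - 1;  // drop the lowest set bit
        ++count;
    }
    return count;
}

// Hamming distance between two integers: the XOR has a set bit
// exactly where the inputs differ, so counting those bits gives the distance.
std::uint64_t hammingDistanceFast(std::uint64_t a, std::uint64_t b) {
    return bitCountKernighan(a ^ b);
}
```

With the values from Example 1, `hammingDistanceFast(11, 2)` XORs 1011 and 0010 to get 1001, which has two set bits. On a C++20 compiler, `std::popcount` from `<bit>` is another option for the counting step.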
Example 2:
Sample C++ code that computes the Hamming distance between two strings:
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>

int hammingDistance(const std::string& str1, const std::string& str2) {
    // First make sure the two strings have the same length
    if (str1.length() != str2.length()) {
        throw std::runtime_error("The input strings should have the same length.");
    }
    int distance = 0;
    for (std::size_t i = 0; i < str1.length(); i++) {
        // Count each position at which the characters differ
        if (str1[i] != str2[i]) {
            distance++;
        }
    }
    return distance;
}

int main() {
    std::string str1 = "101101";
    std::string str2 = "100111";
    try {
        int distance = hammingDistance(str1, str2);
        std::cout << "Hamming distance: " << distance << std::endl;
    } catch (const std::exception& e) {
        std::cerr << "Error: " << e.what() << std::endl;
    }
    return 0;
}
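The code takes two strings, `str1` and `str2`, loops over them one character at a time, and counts the positions where the characters differ; that count is the Hamming distance. Before comparing, it checks that the two strings have the same length and throws an exception if they do not.
Running the sample code prints:
Hamming distance: 2
This means that the Hamming distance between the strings "101101" and "100111" is 2.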
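The explicit loop in Example 2 can also be written with a standard algorithm. The following is a minimal sketch, assuming C++14 or later, that uses `std::inner_product` with `std::not_equal_to<>` as the pairwise operation and `std::plus<>` as the accumulator, so each mismatching pair contributes 1 to the sum. The function name `hammingDistanceStl` is hypothetical, and throwing `std::invalid_argument` on a length mismatch is a choice made for this sketch rather than something taken from the example.

```cpp
#include <functional>
#include <numeric>
#include <stdexcept>
#include <string>

// Hamming distance via std::inner_product: for each pair of characters,
// not_equal_to yields true (1) or false (0), and plus sums the results.
int hammingDistanceStl(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("strings must have the same length");
    }
    return std::inner_product(a.begin(), a.end(), b.begin(), 0,
                              std::plus<>(), std::not_equal_to<>());
}
```

Called with "101101" and "100111", this returns 2, matching the result of the loop-based version.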