-
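Variables of built-in types such as bool are best initialized explicitly. Otherwise the initial value can differ between compilers and builds: in our case, a bool member that was never explicitly initialized started out as 240 rather than false/true. A std::string is a class type, so its default constructor always yields an empty string, but initializing it explicitly in the constructor does no harm. Reading an uninitialized built-in variable is undefined behavior, so any value may appear. Two quotes on this. According to C++ standard Section 8.5.12:
- if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value. In other words, without initialization the value is indeterminate.
- For primitive built-in data types (bool, char, wchar_t, short, int, long, float, double, long double), only global variables (all static storage variables) get default value of zero if they are not explicitly initialized. In other words, a built-in type is initialized for you only when it has static storage duration, for example a global or static variable.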
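    #include <string>
    #include <vector>

    struct SegResultContext {
        std::string word;
        std::vector<int> semantic_ids;
        bool is_addr_start;
        bool is_level4area;

        // Initialize every member explicitly, including the two bools.
        SegResultContext()
            : word(), semantic_ids(), is_addr_start(false), is_level4area(false) {}
    };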
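As an alternative to the constructor initializer list, C++11 default member initializers keep the default next to each declaration. The sketch below assumes a C++11 compiler; `SegResultContextDefaults` and `make_default_context` are hypothetical names used only for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical variant of SegResultContext that relies on
// default member initializers instead of a hand-written constructor.
struct SegResultContextDefaults {
    std::string word;               // class type: default-constructed to ""
    std::vector<int> semantic_ids;  // class type: default-constructed to empty
    bool is_addr_start = false;     // built-in type: needs an explicit default
    bool is_level4area = false;     // built-in type: needs an explicit default
};

// Returns a context whose members were never assigned after construction.
inline SegResultContextDefaults make_default_context() {
    SegResultContextDefaults ctx;
    return ctx;
}
```

With this form, a newly added bool member carries its default in the same line that declares it, so it is harder to forget.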
-
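replace_all(remove_spaces(trim(text))) fails to compile when the calls are nested, but works when split into three statements. Why?
- The parameters of the three functions replace_all, remove_surplus_spaces, and trim were all declared as string &s rather than const string &s.
- The difference: a plain (non-const) reference parameter severely limits which arguments a function accepts. A const object, a literal, an object that needs a type conversion, or a temporary returned by a function call cannot be passed to a plain reference parameter. In a nested call, the inner function returns exactly such a temporary, so it cannot bind to string &. Adding const to the reference parameters of the three functions, making them const references, allows the calls to be nested to any depth.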
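    #include <regex>
    #include <string>
    using namespace std;

    // Collapse each run of whitespace into a single space.
    string rm_surplus_spaces(const string &s) {
        regex r("\\s+");
        string new_s = regex_replace(s, r, " ");
        return new_s;
    }

    // Replace every match of old_value; note that old_value is interpreted as a regex pattern.
    string replace_all(const string &s, const string &old_value, const string &new_value) {
        regex re(old_value);
        string ss = regex_replace(s, re, new_value);
        return ss;
    }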
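To make the binding rule concrete, here is a minimal sketch of nesting two helpers that take const references. `trim_copy`, `squeeze_spaces`, and `normalize` are hypothetical names, not the functions from the original project, and the whitespace set used by `trim_copy` is an assumption.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: strips leading and trailing spaces, tabs, and newlines.
std::string trim_copy(const std::string &s) {
    const char *ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";  // nothing but whitespace
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: collapses each run of whitespace into one space.
std::string squeeze_spaces(const std::string &s) {
    static const std::regex run("\\s+");
    return std::regex_replace(s, run, " ");
}

// The temporary returned by trim_copy binds to the const std::string&
// parameter of squeeze_spaces, so the nested call compiles.
std::string normalize(const std::string &text) {
    return squeeze_spaces(trim_copy(text));
}
```

If either parameter were declared as `std::string &`, the call `squeeze_spaces(trim_copy(text))` would be rejected, because a temporary cannot bind to a non-const lvalue reference.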
-
To measure how long a block of code takes to run, use std::chrono::high_resolution_clock. The example reports microseconds (us); the chrono library also provides other duration units, such as s, ms, us, and ns.
    // Requires <chrono>, <iostream>, <string>, and using namespace std.
    auto start_micro = std::chrono::high_resolution_clock::now();
    // ... code block to measure ...
    auto ela_s1 = std::chrono::high_resolution_clock::now();
    long long cost_s1 = std::chrono::duration_cast<std::chrono::microseconds>(ela_s1 - start_micro).count();
    cout << "cost in us : " << to_string(cost_s1) << endl;
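The same measurement can be wrapped in a small reusable helper. This is a sketch, not the original code: `elapsed_us` is a hypothetical name, and it swaps in std::chrono::steady_clock, which is guaranteed to be monotonic, in place of high_resolution_clock.

```cpp
#include <chrono>

// Hypothetical helper: runs fn once and returns the elapsed
// wall-clock time in microseconds.
template <typename Fn>
long long elapsed_us(Fn &&fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A call might look like `long long us = elapsed_us([&] { do_work(); });`, where `do_work` stands for whatever code is being timed.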
-
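How do you correctly read each character from a string? (In UTF-8 text that mixes Chinese and English, one character occupies 1 to 4 bytes: ASCII takes 1 byte and most Chinese characters take 3. The code also accepts the 5- and 6-byte lead bytes of the original UTF-8 design, which current UTF-8 no longer permits.)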
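    // Requires <iostream>, <string>, and using namespace std; assumes input is UTF-8.
    bool read_by_word(const std::string &input) {
        // input: a line in normal format
        std::string ch;
        // Use i < length rather than i != length so a truncated trailing
        // character cannot push i past the end of the string.
        for (size_t i = 0, len = 0; i < input.length(); i += len) {
            // The lead byte tells how many bytes the character occupies.
            unsigned char byte = static_cast<unsigned char>(input[i]);
            if (byte >= 0xFC)  // length 6
                len = 6;
            else if (byte >= 0xF8)
                len = 5;
            else if (byte >= 0xF0)
                len = 4;
            else if (byte >= 0xE0)
                len = 3;
            else if (byte >= 0xC0)
                len = 2;
            else
                len = 1;
            ch = input.substr(i, len);
            cout << ch << endl;
        }
        return true;
    }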
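A variant that returns the characters instead of printing them is easier to reuse and to test. The sketch below assumes UTF-8 input limited to the 1- to 4-byte forms of current UTF-8 and does not validate continuation bytes; `utf8_char_length` and `split_utf8_chars` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Number of bytes in a UTF-8 character, judged from its lead byte only.
inline std::size_t utf8_char_length(unsigned char lead) {
    if (lead >= 0xF0) return 4;
    if (lead >= 0xE0) return 3;
    if (lead >= 0xC0) return 2;
    return 1;  // ASCII, or a stray continuation byte treated as one unit
}

// Splits a UTF-8 string into one std::string per character.
std::vector<std::string> split_utf8_chars(const std::string &input) {
    std::vector<std::string> chars;
    for (std::size_t i = 0; i < input.size();) {
        std::size_t len = utf8_char_length(static_cast<unsigned char>(input[i]));
        if (len > input.size() - i) len = input.size() - i;  // clamp a truncated tail
        chars.push_back(input.substr(i, len));
        i += len;
    }
    return chars;
}
```

Clamping the length means malformed or truncated input yields a short final element rather than an out-of-range access.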
-
Sorting a vector of objects by one of their member variables
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>
    using namespace std;

    struct person {
        string name;
        int age;
    };

    // The generic lambda (auto parameters) requires C++14 or later.
    inline void sort_vector(std::vector<person> &people) {
        std::sort(people.begin(), people.end(),
                  [](auto const &a, auto const &b) { return a.age < b.age; });
    }

    int main() {
        person p1{"john", 22};
        person p2{"david", 19};
        person p3{"amy", 7};
        vector<person> people;
        people.push_back(p1);
        people.push_back(p2);
        people.push_back(p3);
        sort_vector(people);
        for (const auto &v : people) {
            cout << v.name << " " << v.age << endl;
        }
    }
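When two objects share the same key, a secondary key makes the order deterministic. The sketch below sorts by age and breaks ties by name using std::tie, and it uses typed lambda parameters so that C++11 is enough. `Person` and `sort_by_age_then_name` are hypothetical names that mirror the example above.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical mirror of the `person` struct above.
struct Person {
    std::string name;
    int age;
};

// Sorts by age first; people of equal age are ordered by name.
inline void sort_by_age_then_name(std::vector<Person> &people) {
    std::sort(people.begin(), people.end(),
              [](const Person &a, const Person &b) {
                  // std::tie builds tuples of references, compared lexicographically.
                  return std::tie(a.age, a.name) < std::tie(b.age, b.name);
              });
}
```

std::sort does not preserve the relative order of equal elements, so std::stable_sort is the alternative when no natural secondary key exists.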
-
Regex: match the whole string and print the value captured by each group
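    // Requires <iostream>, <regex>, <string>, and using namespace std.
    string s = "联系人:你好嘛联系电话:242424354354详细地址:浙江省(拒绝到付件!面单上寄件人号码 名字要填写上去哦,另外再注明一下是";
    // Group 1 is a label (recipient / contact / name), group 2 is the value that follows it.
    // With std::string, std::regex works on bytes, so {6,18} counts bytes, not characters.
    regex re(".*?(收件人|联系人|姓名)[\\s,:]+([^\\s,:]{6,18})[\\s,]*(手机|电话|号码|联系|\\d+|$).*?");
    std::smatch what;
    // regex_match succeeds only if the entire string matches; what[0] is the full match.
    // If there is no match, what.size() is 0 and the loop prints nothing.
    std::regex_match(s, what, re);
    for (size_t i = 0; i < what.size(); ++i) {
        cout << what[i].str() << " " << i << endl;
    }
    // Or fetch one specific group
    if (std::regex_match(s, what, re)) {
        cout << what[2].str() << endl;
    }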
-
Regex: print every value that matches the expression, using regex_search (the examples show returning the first match and returning every match)
    // Requires <iostream>, <regex>, <string>, and using namespace std.
    void test_regex_search() {
        string pattern("[^c]ei");
        pattern = "[[:alpha:]]*" + pattern + "[[:alpha:]]*";
        regex r(pattern);
        smatch results;
        string test_str = "receipt freind theif receive";
        // regex_search stops at the first match; results[0] is that match.
        if (regex_search(test_str, results, r)) {
            for (size_t i = 0; i < results.size(); i++)
                cout << results[i].str() << endl;
        }
        // First form, more explicit: walk every match with sregex_iterator
        sregex_iterator it(test_str.begin(), test_str.end(), r);
        sregex_iterator it_end;
        while (it != it_end) {
            cout << it->str() << endl;
            ++it;
        }
        // Second form, equivalent but more concise
        for (sregex_iterator it(test_str.begin(), test_str.end(), r), it_end; it != it_end; ++it)
            cout << it->str() << endl;
    }
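The iterator loop can be packaged as a function that collects every match into a vector. This is a minimal sketch; `find_all` is a hypothetical name.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the text of every non-overlapping match of re in text, in order.
std::vector<std::string> find_all(const std::string &text, const std::regex &re) {
    std::vector<std::string> matches;
    // A default-constructed sregex_iterator serves as the end marker.
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        matches.push_back(it->str());
    return matches;
}
```

For the pattern and test string used above, the words in which "ei" is not preceded by "c" are freind and theif, so those are the two elements returned.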
-
Reading a local file
    // Requires <fstream>, <iostream>, and <string>; runs inside main().
    std::string file_in = "test.txt";
    std::ifstream ifile(file_in.c_str());
    if (!ifile.is_open()) {
        std::cout << "regex model file - open error!" << std::endl;
        return 1;
    }
    std::string line;
    while (getline(ifile, line)) {
        if (line.empty())  // skip blank lines
            continue;
        std::cout << line << std::endl;
    }
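Writing the loop against std::istream rather than std::ifstream lets the same code read from a file, a string stream, or standard input. A minimal sketch follows; `read_nonempty_lines` is a hypothetical name.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads every non-empty line from the stream, in order.
std::vector<std::string> read_nonempty_lines(std::istream &in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;  // skip blank lines, as in the loop above
        lines.push_back(line);
    }
    return lines;
}
```

To read a file, construct a `std::ifstream` (from `<fstream>`), check `is_open()`, and pass it to the function.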
-
Parsing JSON data with picojson.h. picojson.h is a single header that can be found through Google or on GitHub; just copy it into the project.
    #include <iostream>
    #include <map>
    #include <string>
    #include "picojson.h"
    using namespace std;

    int main() {
        string query = "{\"words_in\":\"c.tb.cn/I3.ZWA4n\", \"id\":\"NEW2\", \"type\":\"text\"}";
        map<string, string> features;
        picojson::value v;
        std::string err;
        picojson::parse(v, query.begin(), query.end(), &err);
        // Stop if parsing failed or the top-level value is not a JSON object.
        if (!err.empty() || !v.is<picojson::object>()) {
            cout << err.c_str() << endl;
            return 1;
        }
        // Copy every top-level key/value pair into the map as strings.
        const picojson::value::object &obj = v.get<picojson::object>();
        for (picojson::value::object::const_iterator it = obj.begin(); it != obj.end(); it++) {
            features[it->first] = it->second.to_str();
            cout << it->first << " " << it->second.to_str() << endl;
        }
    }
to be continued…